Abstract 

Most recent developments on the stochastic block model (SBM) rely on the knowledge 
of the model parameters, or at least on the number of communities. This paper introduces 
efficient algorithms that do not require such knowledge and yet achieve the optimal 
information-theoretic tradeoffs identified in [AS15] for linear size communities. The 
results are three-fold: (i) in the constant degree regime, an algorithm is developed 
that requires only a lower-bound on the relative sizes of the communities and detects 
communities with an optimal accuracy scaling for large degrees; (ii) in the regime 
where degrees are scaled by $\omega(1)$ (diverging degrees), this is enhanced into a fully 
agnostic algorithm that only takes the graph in question and simultaneously learns the 
model parameters (including the number of communities) and detects communities with 
accuracy $1 - o(1)$, with an overall quasi-linear complexity; (iii) in the logarithmic degree 
regime, an agnostic algorithm is developed that learns the parameters and achieves 
the optimal CH-limit for exact recovery, in quasi-linear time. These provide the first 
algorithms affording efficiency, universality and information-theoretic optimality for 
strong and weak consistency in the general SBM with linear size communities. 
1 Introduction 


This paper studies the problem of recovering communities in the general stochastic block 
model with linear size communities, for constant and slowly diverging degree regimes. In 
contrast to [AS15], this paper does not require knowledge of the SBM parameters. In 
particular, the problem of learning the model parameters is solved when average degrees are 
diverging. We next provide some motivations on the problem and further background on 
the model. 

Detecting communities (or clusters) in graphs is a fundamental problem in networks, 
computer science and machine learning. This applies to a large variety of complex networks 
(e.g., social and biological networks) as well as to data sets engineered as networks via 
similarity graphs, where one often attempts to get a first impression of the data by trying 
to identify groups with similar behavior. In particular, finding communities allows one 
to find like-minded people in social networks [GN02, NWS], to improve recommendation 
systems [LSY03, XWZ+14], to segment or classify images [SM97, SHB07], to detect protein 
complexes [CY06, MPN+99], to find genetically related sub-populations [PSD00, JTZ04], 
or to discover new tumor subclasses [SPT+01]. 

While a large variety of community detection algorithms have been deployed in the 
past decades, the understanding of the fundamental limits of community detection has 
only appeared more recently, in particular for the SBM [Co10, DKMZ11, Mas14, MNS14, 
ABH14, MNSb, YC14, AS15]. The SBM is a canonical model for community detection 
[HLL83, WBB76, FMW85, WW87, BC09, KN11, BCLS87, DF89, Bop87, JS98, CK99, 
CI01, SN97, McS01, RCY11, CWA12, CSX12], where n vertices are partitioned into 
k communities of relative size $p_i$, $i \in [k]$, and pairs of nodes in communities i and j connect 
independently with probability $W_{i,j}$. 

Recently the SBM came back to the center of attention at both the practical level, 
due to extensions allowing overlapping communities [ABFX08] that have proved to fit 
real data sets in massive networks well [GB13], and at the theoretical level due to new phase 
transition phenomena [Co10, DKMZ11, Mas14, MNS14, ABH14, MNSb]. The latter works 
focus exclusively on the SBM with two symmetric communities, i.e., each community is of the 
same size and the connectivity in each community is identical. Denoting by p the intra- and q 
the extra-cluster probabilities, most of the results are concerned with two figures of merit: (i) 
recovery (also called exact recovery or strong consistency), which investigates the regimes 
of p and q for which there exists an algorithm that recovers with high probability the two 
communities completely [BCLS87, DF89, Bop87, JS98, CK99, CI01, SN97, McS01, BC09, 
RCY11, CWA12, CSX12, Vu14, YC14], (ii) detection, which investigates the regimes for 
which there exists an algorithm that recovers with high probability a positively correlated 
partition [Co10, DKMZ11, MNS12, Mas14, MNS14]. 

The sharp threshold for exact recovery was obtained in [ABH14, MNSb], showing$^1$ 
that for $p = a\log(n)/n$, $q = b\log(n)/n$, $a, b > 0$, exact recovery is solvable if and only if 
$|\sqrt{a} - \sqrt{b}| \ge \sqrt{2}$, with efficient algorithms achieving the threshold provided in [ABH14, MNSb]. 
In addition, [ABH14] introduces an SDP proved to achieve the threshold in [BH14, Ban15], 
while [YP14] shows that a spectral algorithm also achieves the threshold. Prior to these, 
the sharp threshold for detection was obtained in [Mas14, MNS14], showing that detection 
is solvable (and so efficiently) if and only if $(a - b)^2 > 2(a + b)$, when $p = a/n$, $q = b/n$, 
settling a conjecture made in [DKMZ11] and improving on [Co10]. 

1 [MNSb] generalizes this to $a, b = \Theta(1)$. 
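
As a quick numerical companion to the two thresholds quoted above for two symmetric communities, the following minimal sketch checks on which side of each threshold a pair (a, b) falls. The function names are illustrative and not taken from the cited works; the thresholds are assumed to read $|\sqrt{a} - \sqrt{b}| \ge \sqrt{2}$ (logarithmic degrees) and $(a - b)^2 > 2(a + b)$ (constant degrees).

```python
import math


def exact_recovery_solvable(a: float, b: float) -> bool:
    """Two symmetric communities with p = a*log(n)/n and q = b*log(n)/n.

    Evaluates the assumed threshold |sqrt(a) - sqrt(b)| >= sqrt(2).
    """
    return abs(math.sqrt(a) - math.sqrt(b)) >= math.sqrt(2)


def detection_solvable(a: float, b: float) -> bool:
    """Two symmetric communities with p = a/n and q = b/n.

    Evaluates the assumed threshold (a - b)^2 > 2(a + b).
    """
    return (a - b) ** 2 > 2 * (a + b)
```

For instance, (a, b) = (9, 1) lies above the exact recovery threshold while (3, 1) lies below it; the two regimes scale the degrees differently, so the two tests are not directly comparable for the same pair.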

Besides the detection and the recovery properties, one may ask about the partial recovery 
of the communities, studied in [MNSa, GV14, Vu14, CRV15, AS15]. Of particular interest to 
this paper is the case of almost exact recovery (also called weak consistency), where only a 
vanishing fraction of the nodes is allowed to be misclassified. For two symmetric communities, 
[MNSb] shows that almost exact recovery is possible if and only if $n(p - q)^2/(p + q)$ diverges, 
generalized in [AS15] for general SBMs. 

In the next section, we discuss the results for the general SBM of interest in this paper 
and the problem of learning the model parameters. We conclude this section by providing 
motivations on the problem of achieving the threshold with an efficient and universal 
algorithm. 

Threshold phenomena have long been studied in fields such as information theory (e.g., 
Shannon’s capacity) and constraint satisfaction problems (e.g., the SAT threshold). In 
particular, the quest of achieving the threshold has generated major algorithmic developments 
in these fields (e.g., LDPC codes, polar codes, survey propagation to name a few). Likewise, 
identifying thresholds in community detection models is key to benchmark and guide the 
development of clustering algorithms. Most reasonable algorithms may succeed in some 
regimes, while in others they may be doomed to fail due to computational barriers. However, 
it is particularly crucial to develop benchmarks that do not depend sensitively on the 
knowledge of the model parameters. A natural question is hence whether one can solve 
the various recovery problems in the SBM without having access to the parameters. This 
paper answers this question in the affirmative for the exact and almost exact recovery of 
the communities. 

1.1 Related results on the general SBM with known parameters 

Most of the previous works are concerned with the SBM having symmetric communities 
(mainly 2 or sometimes k), with the exception of [Vu14] which provides some achievability 
results for the general SBM.$^2$ Recently, [AS15] studied the fundamental limits for the general 
SBM, with results as follows (where SBM(n, p, W) is the SBM with community prior p and 
connectivity matrix W). 

I. Partial and almost exact recovery in the general SBM. The first result of [AS15] 
concerns the regime where the connectivity matrix scales as Q/n for a positive symmetric 
matrix Q (i.e., the node average degree is constant). The following notion of SNR is 
introduced$^3$ 

$$\mathrm{SNR} = |\lambda_{\min}|^2 / \lambda_{\max}, \tag{1}$$

where $\lambda_{\min}$ and $\lambda_{\max}$ are respectively the smallest$^4$ and largest eigenvalues of diag(p)Q. 

2 [GV14] also study variations of the k-symmetric model. 

3 Note that this is in a sense the “worst-case” notion of SNR, which ensures that all of the communities 
can be separated (when amplified); one could consider other ratios of the kind $|\lambda_j|^2/\lambda_{\max}$, for subsequent 
eigenvalues (j = 2, 3, ...), if interested in separating only subsets of the communities. 
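
To make the SNR concrete, here is a minimal sketch for the case of two communities, where the eigenvalues of diag(p)Q have a closed form. The function name and the restriction to k = 2 are choices made for this illustration only; the general case would need a numerical eigenvalue routine.

```python
import math


def snr_two_communities(p, Q):
    """SNR = |lambda_min|^2 / lambda_max for the 2x2 matrix diag(p) Q.

    p: community prior (p[0], p[1]); Q: symmetric 2x2 matrix with positive entries.
    """
    m00, m01 = p[0] * Q[0][0], p[0] * Q[0][1]
    m10, m11 = p[1] * Q[1][0], p[1] * Q[1][1]
    trace = m00 + m11
    det = m00 * m11 - m01 * m10
    # The discriminant equals (m00 - m11)^2 + 4*m01*m10 >= 0, so eigenvalues are real.
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    eigs = [(trace + disc) / 2.0, (trace - disc) / 2.0]
    lam_max = max(eigs)            # largest eigenvalue
    lam_min = min(eigs, key=abs)   # eigenvalue of least magnitude
    return lam_min ** 2 / lam_max
```

With p = (1/2, 1/2) and Q having diagonal entries a = 5 and off-diagonal entries b = 1, the eigenvalues are (a + b)/2 = 3 and (a - b)/2 = 2, so the SNR is 4/3.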

The algorithm Sphere-comparison is proposed that solves partial recovery with exponential 
accuracy and quasi-linear complexity when the SNR diverges, solving in particular 
almost exact recovery. 

Theorem 1. [AS15] Given any $k \in \mathbb{Z}$, $p \in (0,1)^k$ with $|p| = 1$, and symmetric matrix Q 
with no two rows equal, let $\lambda$ be the largest eigenvalue of PQ, and $\lambda'$ be the eigenvalue of 
PQ with the smallest nonzero magnitude. If $\rho := |\lambda'|^2/\lambda > 4$, $\lambda^7 < (\lambda')^8$, and $4\lambda^3 < (\lambda')^4$, 
then for some $\varepsilon = \varepsilon(\lambda, \lambda')$ and $C = C(p, Q) > 0$, Sphere-comparison (see Section 3.1) 
detects with high probability communities in graphs drawn from SBM(n, p, Q/n) with accuracy 

$$1 - \frac{4k\, e^{-\frac{C\rho}{16k}}}{1 - \exp\left(-\frac{C\rho}{16k}\left(\frac{(\lambda')^4}{\lambda^3} - 1\right)\right)},$$

provided that the above is larger than $1 - \frac{\min_i p_i}{2\ln(4k)}$, and 
runs in $O(n^{1+\varepsilon})$ time. Moreover, $\varepsilon$ can be made arbitrarily small with $8\ln(\lambda\sqrt{2}/|\lambda'|)/\ln(\lambda)$, 
and $C(p, \alpha Q)$ is independent of $\alpha$. 

Note that for k symmetric clusters, SNR reduces to $\frac{(a-b)^2}{k(a+(k-1)b)}$, which is the quantity of 
interest for detection [DKMZ11, MNS12]. Moreover, the SNR must diverge to ensure almost 
exact recovery in the symmetric case [AS15]. The following is an important consequence of 
the previous theorem, as it shows that Sphere-comparison achieves almost exact recovery 
when the entries of Q are scaled. 

Corollary 1. [AS15] For any $k \in \mathbb{Z}$, $p \in (0,1)^k$ with $|p| = 1$, and symmetric matrix Q with 
no two rows equal, there exists $\varepsilon(\delta) = O(1/\ln(\delta))$ such that for all sufficiently large $\delta$ there 
exists an algorithm (Sphere-comparison) that detects communities in graphs drawn from 
SBM(n, p, $\delta Q/n$) with accuracy $1 - e^{-\Omega(\delta)}$ and complexity $O_n(n^{1+\varepsilon(\delta)})$. 

II. Exact recovery in the general SBM. The second result in [AS15] is for the regime 
where the connectivity matrix scales as log(n)Q/n, Q fixed, where it is shown that exact 
recovery has a sharp threshold characterized by the divergence function 

$$D_+(f, g) = \max_{t \in [0,1]} \sum_{x \in [k]} \left( t f(x) + (1 - t) g(x) - f(x)^t g(x)^{1-t} \right),$$

named the CH-divergence in [AS15]. Specifically, if all pairs of columns in diag(p)Q are at 
$D_+$-distance at least 1 from each other, then exact recovery is solvable in the general SBM. 
This provides in particular an operational meaning to a new divergence function analogous to 
the KL-divergence in the channel coding theorem (see Section 2.3 in [AS15]). Moreover, an 
algorithm (Degree-profiling) is developed that solves exact recovery down to the $D_+$ limit 
in quasi-linear time, showing that exact recovery has no informational to computational gap 
(as opposed to the conjectures made for detection with more than 4 communities [DKMZ11]). 
The following gives a more general statement characterizing which subset of communities 
can be extracted (see Definition 3 for formal definitions). 

Theorem 2. [AS15] (i) Exact recovery is solvable in the stochastic block model $G_2(n, p, Q)$ 
for a partition $[k] = \sqcup_{s=1}^t A_s$ if and only if for all i and j in different subsets of the partition,$^5$ 

$$D_+((PQ)_i, (PQ)_j) \ge 1. \tag{2}$$

In particular, exact recovery is information-theoretically solvable in SBM(n, p, Q log(n)/n) if 
and only if $\min_{i,j \in [k],\, i \ne j} D_+((PQ)_i \| (PQ)_j) \ge 1$. 

(ii) The Degree-profiling algorithm (see [AS15]) recovers the finest partition that can be 
recovered with probability $1 - o_n(1)$ and runs in $o(n^{1+\varepsilon})$ time for all $\varepsilon > 0$. In particular, 
exact recovery is efficiently solvable whenever it is information-theoretically solvable. 

4 The smallest eigenvalue of diag(p)Q is the one with least magnitude. 

5 The entries of Q are assumed to be non-zero. 
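
The divergence $D_+$ appearing in the condition above can be evaluated numerically. The sketch below is only an illustration (the function name and the use of ternary search are choices made here, not part of [AS15]); it relies on the fact that the objective is concave in t, since each term $f(x)^t g(x)^{1-t}$ is convex in t. Entries are assumed to be strictly positive.

```python
def ch_divergence(f, g, iters=200):
    """Approximate D_+(f, g) = max over t in [0, 1] of
    sum_x ( t*f(x) + (1-t)*g(x) - f(x)**t * g(x)**(1-t) ).

    f, g: equal-length sequences of positive floats.
    """
    def objective(t):
        return sum(t * x + (1.0 - t) * y - (x ** t) * (y ** (1.0 - t))
                   for x, y in zip(f, g))

    lo, hi = 0.0, 1.0
    for _ in range(iters):
        # Ternary search: valid because the objective is concave on [0, 1].
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective((lo + hi) / 2.0)
```

For two symmetric communities with p = (1/2, 1/2) and Q having entries a on the diagonal and b elsewhere, the two columns of diag(p)Q are (a/2, b/2) and (b/2, a/2), the maximum is attained at t = 1/2, and the value is $(\sqrt{a} - \sqrt{b})^2/2$. The condition $D_+ \ge 1$ is then the same as $|\sqrt{a} - \sqrt{b}| \ge \sqrt{2}$.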

In summary, exact or almost exact recovery is closed for the general SBM (and detection 
is closed for 2 symmetric communities). However this is for the case where the parameters 
of the SBM are assumed to be known, and with linear-size communities. 

1.2 Estimating the parameters 

For the estimation of the parameters, some results are known for two symmetric communities. 
In the logarithmic degree regime, the SDP is agnostic to the parameters (it is a 
relaxation of the min-bisection), and the parameters can be estimated by recovering the 
communities [ABH14, BH14, Ban15]. For the constant-degree regime, [MNS12] shows that 
the parameters can be estimated above the threshold by counting cycles (which is efficiently 
approximated by counting non-backtracking walks). These are however for a fixed number 
of communities, namely 2. We also recently became aware of a parallel work [BCS15], which 
considers private graphon estimation (including SBMs). In particular, for the logarithmic 
degree regime, [BCS15] obtains a procedure to estimate parameters of graphons in an 
appropriate version of the $L_2$ norm. This procedure is however not efficient. 

For the general SBM, the results of [AS15] allow one to find communities efficiently; however, 
these rely on the knowledge of the parameters. Hence, a major open problem is to understand 
if these results can be extended without such knowledge. 

2 Results 

Agnostic algorithms are developed for the constant and diverging node degrees. These 
afford optimal accuracy scaling for large node degrees and achieve the CH-divergence limit 
for logarithmic node degrees in quasi-linear time. In particular, these solve the parameter 
estimation problems for SBM(n, p, $\omega(1)Q/n$) without knowing the number of communities. An 
example with real data is provided in Section 4. 

2.1 Definitions and terminologies 

The general stochastic block model SBM(n, p, W) is a random graph ensemble defined on 
the vertex-set V = [n], where each vertex $v \in V$ is assigned independently a hidden (or 
planted) label $\sigma_v$ in [k] under a probability distribution $p = (p_1, \ldots, p_k)$ on [k], and each 
(unordered) pair of nodes $(u, v) \in V \times V$ is connected independently with probability $W_{\sigma_u,\sigma_v}$, 
where $W_{\sigma_u,\sigma_v}$ is specified by a symmetric $k \times k$ matrix W with entries in [0,1]. Note that 
G ~ SBM(n, p, W) denotes a random graph drawn under this model, without the hidden 
(or planted) clusters (i.e., the labels $\sigma_v$) revealed. The goal is to recover these labels by 
observing only the graph. 
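
The model can be simulated directly from this definition. The following is a minimal sketch, with a hypothetical function name and a quadratic-time loop that is only meant for small n; the labels are returned alongside the edges so that a recovery procedure can be evaluated against them.

```python
import random


def sample_sbm(n, p, W, seed=0):
    """Draw (labels, edges) from SBM(n, p, W).

    p: probabilities on [k]; W: symmetric k x k matrix with entries in [0, 1].
    Vertices are 0..n-1; edges are returned as a set of pairs (u, v) with u < v.
    """
    rng = random.Random(seed)
    k = len(p)
    # Each vertex receives an independent hidden label distributed as p.
    labels = rng.choices(range(k), weights=p, k=n)
    edges = set()
    # Each unordered pair is connected independently with probability W[s_u][s_v].
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < W[labels[u]][labels[v]]:
                edges.add((u, v))
    return labels, edges
```

In the regimes of interest W depends on n, e.g. W = Q/n or W = ln(n)Q/n, so a caller would rescale a fixed matrix Q before sampling.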

This paper focuses on p independent of n (the communities have linear size), W dependent 
on n such that the average node degrees are either constant or logarithmically growing, and 
k fixed. These assumptions on p and k could be relaxed, for example to slowly growing k, 
but we leave this for future work. As discussed in the introduction, the above regimes for 
W are both motivated by applications, as networks are typically sparse [LLDM08, Str01] 
though the average degrees may not be small, and by the fact that interesting mathematical 
phenomena take place in these regimes. For convenience, we attribute specific notations for 
the model in these regimes: 

Definition 1. For a symmetric matrix $Q \in \mathbb{R}_+^{k \times k}$, $G_1(n, p, Q)$ denotes SBM(n, p, Q/n), and 
$G_2(n, p, Q)$ denotes SBM(n, p, ln(n)Q/n). 

Definition 2. (Partial recovery.) An algorithm recovers or detects communities in SBM(n, p, W) 
with an accuracy of $\alpha \in [0,1]$, if it outputs a labelling of the nodes $\{\sigma'(v), v \in V\}$, which 
agrees with the true labelling $\sigma$ on a fraction $\alpha$ of the nodes with probability $1 - o_n(1)$. The 
agreement is maximized over relabellings of the communities. 
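
The maximization over relabellings can be made explicit for a small number of communities. The sketch below is illustrative only (the function name is not from the paper) and enumerates all k! permutations, which is acceptable for fixed small k.

```python
from itertools import permutations


def accuracy(true_labels, predicted, k):
    """Fraction of nodes on which `predicted` agrees with `true_labels`,
    maximized over relabellings (permutations) of the k communities."""
    n = len(true_labels)
    best = 0
    for perm in permutations(range(k)):
        # perm[t] is the community name assigned to predicted label t.
        agree = sum(1 for s, t in zip(true_labels, predicted) if perm[t] == s)
        best = max(best, agree)
    return best / n
```

For example, a prediction that swaps the names of two perfectly recovered communities still has accuracy 1.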

Definition 3. (Exact recovery.) Exact recovery is solvable in SBM(n, p, W) for a community 
partition $[k] = \sqcup_{s=1}^t A_s$, where $A_s$ is a subset of [k], if there exists an algorithm that takes 
G ~ SBM(n, p, W) and assigns to each node in G an element of $\{A_1, \ldots, A_t\}$ that contains 
its true community$^6$ with probability $1 - o_n(1)$. Exact recovery is solvable in SBM(n, p, W) if 
it is solvable for the partition of [k] into k singletons, i.e., all communities can be recovered. 

The problem is solvable information-theoretically if there exists an algorithm that solves 
it, and efficiently if the algorithm runs in polynomial time in n. Note that exact recovery 
for the partition $[k] = \{i\} \sqcup ([k] \setminus \{i\})$ is equivalent to extracting community i. In general, 
recovering a partition $[k] = \sqcup_{s=1}^t A_s$ is equivalent to merging the communities that are in a 
common subset $A_s$ and recovering the merged communities. Note also that exact recovery 
in SBM(n, p, W) requires the graph not to have vertices of degree 0 in multiple communities 
with high probability (i.e., connectivity in the symmetric case). Therefore, for exact recovery, 
we focus below on $W = \ln(n)Q/n$ where Q is fixed. 

2.2 Partial recovery 

Our main result in the Appendix (Theorem 6) applies to SBM(n, p, Q/n) with arbitrary Q. 
We provide here a specific instance which is easier to parse. 

Theorem 3 (See Theorem 6). Given $\delta > 0$ and for any $k \in \mathbb{Z}$, $p \in (0,1)^k$ with $\sum_i p_i = 1$ 
and $0 < \delta \le \min_i p_i$, and any symmetric matrix Q with no two rows equal such that every 
entry in $Q^k$ is positive (in other words, Q such that there is a nonzero probability of a path 
between vertices in any two communities in a graph drawn from $G_1(n, p, cQ)$), there exists 
$\varepsilon(c) = O(1/\ln(c))$ such that for all sufficiently large c, Agnostic-sphere-comparison(G, $\delta$) 
detects communities in graphs drawn from $G_1(n, p, cQ)$ with accuracy at least $1 - e^{-\Omega(c)}$ in 
$O_n(n^{1+\varepsilon(c)})$ time. 

Note that a vertex in community i has degree 0 with probability exponentially small in c, and 
there is no way to differentiate between vertices of degree 0 from different communities. So, 
an error rate that decreases exponentially with c is optimal. The above gives in particular 
the parameter estimation in the case $c = \omega(1)$ (see also Lemma 17 in the Appendix). 

6 This is again up to relabellings of the communities. 

The general result in the Appendix yields the following refined results in the k-block 
symmetric case. 

Theorem 4. Consider the k-block symmetric case. In other words, $p_i = \frac{1}{k}$ for all i, and 
$Q_{i,j}$ is $\alpha$ if i = j and $\beta$ otherwise. The vector whose entries are all 1s is an eigenvector of 
PQ with eigenvalue $\frac{\alpha + (k-1)\beta}{k}$, and every vector whose entries add up to 0 is an eigenvector 
of PQ with eigenvalue $\frac{\alpha - \beta}{k}$. So, $\lambda = \frac{\alpha + (k-1)\beta}{k}$ and $\lambda' = \frac{\alpha - \beta}{k}$ and $\rho = \frac{(\alpha - \beta)^2}{k(\alpha + (k-1)\beta)}$. Then, 
as long as $k(\alpha + (k-1)\beta)^7 < (\alpha - \beta)^8$ and $4k(\alpha + (k-1)\beta)^3 < (\alpha - \beta)^4$, there exists a 
constant c > 0 such that Agnostic-sphere-comparison(G, 1/k) detects communities with 
an accuracy of $1 - O(e^{-c(\alpha - \beta)^2/(\alpha + (k-1)\beta)})$ for sufficiently large $(\alpha - \beta)^2/(\alpha + (k-1)\beta)$. 
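
The two polynomial conditions of the theorem are easy to test for given values. The sketch below is a hedged illustration: the function name and the returned dictionary are hypothetical, and the formulas assume the eigenvalues $\lambda = (\alpha + (k-1)\beta)/k$ and $\lambda' = (\alpha - \beta)/k$ of the k-block symmetric case, so that the conditions $\lambda^7 < (\lambda')^8$ and $4\lambda^3 < (\lambda')^4$ become the ones stated in the theorem.

```python
def symmetric_case_conditions(k, alpha, beta):
    """Evaluate the quantities of the k-block symmetric case.

    Returns lambda, lambda', rho = lambda'^2 / lambda, and whether the two
    conditions lambda^7 < lambda'^8 and 4*lambda^3 < lambda'^4 hold.
    """
    lam = (alpha + (k - 1) * beta) / k
    lam_prime = (alpha - beta) / k
    return {
        "lambda": lam,
        "lambda_prime": lam_prime,
        "rho": lam_prime ** 2 / lam,
        "cond_7_8": lam ** 7 < lam_prime ** 8,
        "cond_3_4": 4 * lam ** 3 < lam_prime ** 4,
    }
```

Multiplying $\lambda^7 < (\lambda')^8$ through by $k^8$ gives $k(\alpha + (k-1)\beta)^7 < (\alpha - \beta)^8$, and multiplying $4\lambda^3 < (\lambda')^4$ by $k^4$ gives $4k(\alpha + (k-1)\beta)^3 < (\alpha - \beta)^4$, which is how the two forms relate.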

We refer to Section 4 for an example of implementation with real data. 

2.3 Exact recovery 

Recall that from [AS15], exact recovery is information-theoretically solvable in the stochastic 
block model $G_2(n, p, Q)$ for a partition $[k] = \sqcup_{s=1}^t A_s$ if and only if for all i and j in different 
subsets of the partition, 

$$D_+((PQ)_i, (PQ)_j) \ge 1. \tag{3}$$

We next show that this can be achieved without knowing the parameters. Recall that the 
finest partition is the largest partition of [k] that ensures (3). 
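
When p and Q are known, the finest partition can be computed by merging every pair of communities whose columns of diag(p)Q are at $D_+$-distance below 1 and taking the transitive closure. The sketch below illustrates this with a small union-find; the function names are hypothetical, the entries of Q are assumed positive, and $D_+$ is approximated by ternary search on a concave objective.

```python
def ch_divergence(f, g, iters=200):
    """Approximate D_+(f, g) by ternary search over t in [0, 1]."""
    def objective(t):
        return sum(t * x + (1.0 - t) * y - (x ** t) * (y ** (1.0 - t))
                   for x, y in zip(f, g))
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective((lo + hi) / 2.0)


def finest_partition(p, Q):
    """Group communities i, j together whenever D_+((PQ)_i, (PQ)_j) < 1,
    then close transitively. Returns a sorted list of sorted groups."""
    k = len(p)
    # Column i of diag(p) Q has entries p[x] * Q[x][i].
    cols = [[p[x] * Q[x][i] for x in range(k)] for i in range(k)]
    parent = list(range(k))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(k):
        for j in range(i + 1, k):
            if ch_divergence(cols[i], cols[j]) < 1.0:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(k):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

The agnostic setting of this paper is harder precisely because p and Q are not given; this sketch only illustrates the target of the recovery.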

Theorem 5. (See Theorem 7) The Agnostic-degree-profiling algorithm (see Section 
3.2) recovers the finest partition in any $G_2(n, p, Q)$; it uses no input except the graph in 
question, and runs in $o(n^{1+\varepsilon})$ time for all $\varepsilon > 0$. In particular, exact recovery is efficiently 
and universally solvable whenever it is information-theoretically solvable. 

The proof assumes that the entries of Q are non-zero, see Remark 1 for zero entries. 
To achieve this result we rely on a two-step procedure. First, an algorithm is developed 
to recover all but a vanishing fraction of nodes — this is the main focus of our partial 
recovery result — and then a procedure is used to “clean up” the leftover graphs using the 
node degrees of the preliminary classification. This turns out to be much more efficient 
than aiming for an algorithm that directly achieves exact recovery. We already used this 
technique in [AS15], but here we also deal with the difficulties resulting from not knowing 
the SBM’s parameters. 

3 Proof Techniques and Algorithms 

3.1 Partial recovery and the Agnostic-sphere-comparison algorithm 

The first key observation used to classify a graph's vertices is that if v is a vertex in a graph 
drawn from $G_1(n, p, Q)$ then for all small r the expected number of vertices in community i 
that are r edges away from v is approximately $e_i \cdot (PQ)^r e_{\sigma_v}$. So, we define: 
Definition 4. For any vertex v, let $N_r(v)$ be the set of all vertices with shortest path to v of 
length r. We also refer to the vector with i-th entry equal to the number of vertices in $N_r(v)$ 
that are in community i as $N_r(v)$. If there are multiple graphs that v could be considered a 
vertex in, let $N_{r[G']}(v)$ be the set of all vertices with shortest paths in G' to v of length r. 
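
The sets $N_r(v)$ are the layers of a breadth-first search from v. A minimal sketch, assuming the graph is given as an adjacency dictionary (a representation chosen here for illustration):

```python
from collections import deque


def shells(adj, v, r_max):
    """Return [N_0(v), N_1(v), ..., N_{r_max}(v)], where N_r(v) lists the
    vertices whose shortest path to v has length exactly r.

    adj: dict mapping each vertex to an iterable of its neighbours.
    """
    dist = {v: 0}
    layers = [[v]] + [[] for _ in range(r_max)]
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == r_max:
            continue  # do not explore beyond radius r_max
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                layers[dist[w]].append(w)
                queue.append(w)
    return layers
```

The vector version of $N_r(v)$ is obtained by counting, within a layer, how many vertices carry each community label; those labels are of course hidden from the algorithm, which is what the rest of the section works around.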

One could probably determine PQ and $e_{\sigma_v}$ given the values of $(PQ)^r e_{\sigma_v}$ for a few different 
r, but using $N_r(v)$ to approximate that would require knowing how many of the vertices 
in $N_r(v)$ are in each community. So, we attempt to get information relating to how many 
vertices in $N_r(v)$ are in each community by checking how it connects to $N_{r'}(v')$ for some 
vertex v' and integer r'. The obvious way to do this would be to compute the cardinality of 
their intersection. Unfortunately, whether a given vertex in community i is in $N_r(v)$ is not 
independent of whether it is in $N_{r'}(v')$, which causes the cardinality $|N_r(v) \cap N_{r'}(v')|$ to 
differ from what one would expect badly enough to disrupt plans to use it for approximations. 

In order to get around this, we randomly assign every edge in G to a set E independently 
with probability c. We hence define the following. 

Definition 5. For any vertices $v, v' \in G$, $r, r' \in \mathbb{Z}$, and subset of G's edges E, let $N_{r,r'[E]}(v \cdot v')$ 
be the number of pairs of vertices $(v_1, v_2)$ such that $v_1 \in N_{r[G \setminus E]}(v)$, $v_2 \in N_{r'[G \setminus E]}(v')$, 
and $(v_1, v_2) \in E$. 
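
The quantity of Definition 5 can be computed by splitting the edges, running a breadth-first search in G\E from each of the two vertices, and then scanning E. The sketch below is one possible reading of the definition, with hypothetical function names; it counts ordered pairs, so an edge of E whose two endpoints both lie in both shells contributes twice.

```python
import random
from collections import deque


def split_edges(edges, c, seed=0):
    """Assign each edge to E independently with probability c; return (E, rest)."""
    rng = random.Random(seed)
    E, rest = set(), set()
    for e in edges:
        (E if rng.random() < c else rest).add(e)
    return E, rest


def shell(adj, v, r):
    """Set of vertices at shortest-path distance exactly r from v."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return {u for u, d in dist.items() if d == r}


def cross_count(rest, E, v, v_prime, r, r_prime):
    """N_{r,r'[E]}(v . v'): pairs (v1, v2) with v1 in N_r(v) and v2 in N_r'(v'),
    both shells taken in the graph with edge set `rest` (G minus E), joined by an edge of E."""
    adj = {}
    for a, b in rest:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    A = shell(adj, v, r)
    B = shell(adj, v_prime, r_prime)
    count = 0
    for a, b in E:
        if a in A and b in B:
            count += 1
        if b in A and a in B:
            count += 1
    return count
```

Only the edges of E and the two shells are touched, which is why the cost is governed by the shell sizes rather than by the size of the whole graph.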

Note that E and G\E are disjoint; however, G is sparse enough that even if they were 
generated independently a given pair of vertices would have an edge between them in 
both with probability o(1). So, E is approximately independent of G\E. Thus, for any 
$v_1 \in N_{r[G \setminus E]}(v)$ and $v_2 \in N_{r'[G \setminus E]}(v')$, $(v_1, v_2) \in E$ with a probability of approximately 
$cQ_{\sigma_{v_1},\sigma_{v_2}}/n$. As a result, 

$$\begin{aligned}
N_{r,r'[E]}(v \cdot v') &\approx N_{r[G \setminus E]}(v) \cdot \frac{cQ}{n} N_{r'[G \setminus E]}(v') \\
&\approx ((1-c)PQ)^r e_{\sigma_v} \cdot \frac{cQ}{n} ((1-c)PQ)^{r'} e_{\sigma_{v'}} \\
&= c(1-c)^{r+r'} e_{\sigma_v} \cdot Q(PQ)^{r+r'} e_{\sigma_{v'}} / n.
\end{aligned}$$


Let $\lambda_1, \ldots, \lambda_h$ be the distinct eigenvalues of PQ, ordered so that $|\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_h| \ge 0$. 
Also define h' so that h' = h if $\lambda_h \ne 0$ and h' = h - 1 if $\lambda_h = 0$. If $W_i$ is the eigenspace of 
PQ corresponding to the eigenvalue $\lambda_i$, and $P_{W_i}$ is the projection operator onto $W_i$, then 


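$$\begin{aligned}
N_{r,r'[E]}(v \cdot v') &\approx \frac{c(1-c)^{r+r'}}{n}\, e_{\sigma_v} \cdot Q(PQ)^{r+r'} e_{\sigma_{v'}} && (4) \\
&= \frac{c(1-c)^{r+r'}}{n} \Big(\sum_i P_{W_i}(e_{\sigma_v})\Big) \cdot Q(PQ)^{r+r'} \Big(\sum_j P_{W_j}(e_{\sigma_{v'}})\Big) && (5) \\
&= \frac{c(1-c)^{r+r'}}{n} \sum_{i,j} P_{W_i}(e_{\sigma_v}) \cdot Q(PQ)^{r+r'} P_{W_j}(e_{\sigma_{v'}}) && (6) \\
&= \frac{c(1-c)^{r+r'}}{n} \sum_{i,j} P_{W_i}(e_{\sigma_v}) \cdot P^{-1} \lambda_j^{r+r'+1} P_{W_j}(e_{\sigma_{v'}}) && (7) \\
&= \frac{c(1-c)^{r+r'}}{n} \sum_{i} \lambda_i^{r+r'+1} P_{W_i}(e_{\sigma_v}) \cdot P^{-1} P_{W_i}(e_{\sigma_{v'}}), && (8)
\end{aligned}$$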
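where the final equality holds because for all $i \ne j$, 

$$\begin{aligned}
\lambda_i P_{W_i}(e_{\sigma_v}) \cdot P^{-1} P_{W_j}(e_{\sigma_{v'}}) &= (PQ\, P_{W_i}(e_{\sigma_v})) \cdot P^{-1} P_{W_j}(e_{\sigma_{v'}}) \\
&= P_{W_i}(e_{\sigma_v}) \cdot Q P_{W_j}(e_{\sigma_{v'}}) \\
&= P_{W_i}(e_{\sigma_v}) \cdot P^{-1} \lambda_j P_{W_j}(e_{\sigma_{v'}}),
\end{aligned}$$

and since $\lambda_i \ne \lambda_j$, this implies that $P_{W_i}(e_{\sigma_v}) \cdot P^{-1} P_{W_j}(e_{\sigma_{v'}}) = 0$. In order to simplify the 
terminology, we introduce the following. 

Definition 6. Let $\zeta_i(v \cdot v') = P_{W_i}(e_{\sigma_v}) \cdot P^{-1} P_{W_i}(e_{\sigma_{v'}})$ for all i, v, and v'. 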


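Equation (8) is dominated by the $\lambda_1^{r+r'+1}$ term, so getting good estimates of the $\lambda_2^{r+r'+1}$ 
through $\lambda_{h'}^{r+r'+1}$ terms requires cancelling it out somehow. As a start, if $|\lambda_1| > |\lambda_2| > |\lambda_3|$ then 

$$N_{r+2,r'[E]}(v \cdot v') \cdot N_{r,r'[E]}(v \cdot v') - N^2_{r+1,r'[E]}(v \cdot v') \approx \frac{c^2(1-c)^{2r+2r'+2}}{n^2} (\lambda_1 - \lambda_2)^2\, \lambda_1^{r+r'+1} \lambda_2^{r+r'+1}\, \zeta_1(v \cdot v')\, \zeta_2(v \cdot v').$$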

More generally, in order to get an expression that can be used to estimate the $\lambda_i$ and $\zeta_i(v \cdot v')$, 
we consider the determinant of the following. 

Definition 7. Let $M_{m,r,r'[E]}(v \cdot v')$ be the $m \times m$ matrix such that $M_{m,r,r'[E]}(v \cdot v')_{i,j} = 
N_{r+i+j,r'[E]}(v \cdot v')$ for each $0 \le i, j < m$. 
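
To see why determinants of such matrices isolate products of eigenvalue terms, it helps to idealize the entries as exact sums of powers, $N_s = \sum_l a_l \mu_l^s$, which is the shape suggested by the approximation of $N_{r,r'[E]}$ above (the symbols $a_l$ and $\mu_l$ are introduced here for illustration only). For such a sequence, the Cauchy-Binet formula gives $\det[N_{i+j}]_{0 \le i,j < m} = \sum_{|S| = m} \prod_{l \in S} a_l \prod_{l < l' \in S} (\mu_l - \mu_{l'})^2$, so only sets of m distinct terms contribute. The sketch below verifies this identity in exact rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations


def det(M):
    """Exact determinant by Gaussian elimination over Fractions."""
    M = [row[:] for row in M]
    n = len(M)
    d = Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d  # a row swap flips the sign
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n):
                M[r][j] -= f * M[c][j]
    return d


def power_sums(a, mu, length):
    """N_s = sum_l a_l * mu_l**s for s = 0, ..., length - 1."""
    return [sum(al * ml ** s for al, ml in zip(a, mu)) for s in range(length)]


def hankel(seq, m):
    """m x m matrix with (i, j) entry seq[i + j]."""
    return [[seq[i + j] for j in range(m)] for i in range(m)]


def subset_expansion(a, mu, m):
    """Sum over m-subsets S of prod a_l * prod_{l < l'} (mu_l - mu_l')**2."""
    total = Fraction(0)
    for S in combinations(range(len(a)), m):
        term = Fraction(1)
        for l in S:
            term *= a[l]
        for x, y in combinations(S, 2):
            term *= (mu[x] - mu[y]) ** 2
        total += term
    return total
```

In the setting of the text the powers grow with r + r', so the subset containing the m eigenvalues of largest magnitude dominates the sum.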

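To the degree that approximation (8) holds and c is small, each column of $M_{m,r,r'[E]}(v \cdot v')$ 
is a linear combination of the vectors 

$$\frac{c}{(1-c)n}\, ((1-c)\lambda_i)^{r+r'+1}\, \zeta_i(v \cdot v') \left[1,\ (1-c)\lambda_i,\ \ldots,\ ((1-c)\lambda_i)^{m-1}\right]^t$$

with coefficients that depend only on $\{\lambda_1, \ldots, \lambda_h\}$. So, by linearity of the determinant in 
each column, $\det(M_{m,r,r'[E]}(v \cdot v'))$ is a linear combination of these vectors' wedge products 
with coefficients that are independent of r and r'. By antisymmetry of wedge products, 
only the products that use m different such vectors contribute to the determinant, and the 
products involving the eigenvalues of highest magnitude will dominate. As a result, there 
exist constants $\gamma(\lambda_1, \ldots, \lambda_m)$ and $\gamma'(\lambda_1, \ldots, \lambda_m)$ such that 

$$\det(M_{m,r,r'[E]}(v \cdot v')) \approx \frac{c^m}{n^m (1-c)^m}\, \gamma(\lambda_1, \ldots, \lambda_m) \prod_{i=1}^{m} ((1-c)\lambda_i)^{r+r'+1} \zeta_i(v \cdot v')$$

if $|\lambda_m| > |\lambda_{m+1}|$, and 

$$\det(M_{m,r,r'[E]}(v \cdot v')) \approx \frac{c^m}{n^m (1-c)^m} \left( \gamma(\lambda_1, \ldots, \lambda_m) \prod_{i=1}^{m} ((1-c)\lambda_i)^{r+r'+1} \zeta_i(v \cdot v') + \gamma'(\lambda_1, \ldots, \lambda_m)\, ((1-c)\lambda_{m+1})^{r+r'+1} \zeta_{m+1}(v \cdot v') \prod_{i=1}^{m-1} ((1-c)\lambda_i)^{r+r'+1} \zeta_i(v \cdot v') \right)$$

if $|\lambda_m| = |\lambda_{m+1}|$. These facts suggest the following plan for estimating the eigenvalues 
corresponding to a graph. First, pick several vertices at random. Then, use the fact that 
$|N_{r[G \setminus E]}(v)| \approx ((1-c)\lambda_1)^r$ for any good vertex v to estimate $\lambda_1$. Next, use the formulas 
above about $\det(M_{m,r,r'[E]}(v \cdot v))$ to get an approximation of h' and all of PQ's eigenvalues 
for each selected vertex. Finally, take the median of these estimates. 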

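Now, note that whether or not $|\lambda_m| = |\lambda_{m+1}|$, we have 

$$\det(M_{m,r+1,r'[E]}(v \cdot v')) - (1-c)^m \lambda_{m+1} \prod_{i=1}^{m-1} \lambda_i\, \det(M_{m,r,r'[E]}(v \cdot v')) \approx \frac{c^m}{n^m}\, \gamma(\lambda_1, \ldots, \lambda_m)\, \frac{\lambda_m - \lambda_{m+1}}{(1-c)^m \lambda_m} \prod_{i=1}^{m} ((1-c)\lambda_i)^{r+r'+2} \zeta_i(v \cdot v').$$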

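That means that 

$$\frac{\det(M_{m,r+1,r'[E]}(v \cdot v')) - (1-c)^m \lambda_{m+1} \prod_{i=1}^{m-1} \lambda_i\, \det(M_{m,r,r'[E]}(v \cdot v'))}{\det(M_{m-1,r+1,r'[E]}(v \cdot v')) - (1-c)^{m-1} \lambda_m \prod_{i=1}^{m-2} \lambda_i\, \det(M_{m-1,r,r'[E]}(v \cdot v'))} \approx \frac{c}{(1-c)n}\, \frac{\gamma(\lambda_1, \ldots, \lambda_m)}{\gamma(\lambda_1, \ldots, \lambda_{m-1})}\, \frac{\lambda_{m-1}(\lambda_m - \lambda_{m+1})}{\lambda_m(\lambda_{m-1} - \lambda_m)}\, ((1-c)\lambda_m)^{r+r'+2}\, \zeta_m(v \cdot v').$$

This fact can be used in combination with estimates of PQ's eigenvalues to approximate 
$\zeta_i(v \cdot v')$ for arbitrary v, v', and i. 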

Of course, this requires r and r' to be large enough that 

$$\frac{c(1-c)^{r+r'}}{n} \lambda_i^{r+r'+1}$$

is large relative to the error terms for all $i \le h'$. At a minimum, that requires that 
$|(1-c)\lambda_i|^{r+r'+1} = \omega(n)$. 

On a different note, for any v and v', 

$$0 \le P_{W_i}(e_{\sigma_v} - e_{\sigma_{v'}}) \cdot P^{-1} P_{W_i}(e_{\sigma_v} - e_{\sigma_{v'}}) = \zeta_i(v \cdot v) - 2\zeta_i(v \cdot v') + \zeta_i(v' \cdot v'),$$

with equality for all i if and only if $\sigma_v = \sigma_{v'}$, so sufficiently good approximations of 
$\zeta_i(v \cdot v)$, $\zeta_i(v \cdot v')$ and $\zeta_i(v' \cdot v')$ can be used to determine which pairs of vertices are in the 
same community. 
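
This separation can be checked by hand in the k-block symmetric case, where P = I/k, the first eigenspace is spanned by the all-ones vector and the second consists of the vectors whose entries sum to 0. The sketch below is restricted to that special case (an assumption made for illustration; the general case needs the eigenspaces of PQ) and evaluates $\zeta_1$ and $\zeta_2$ from the two projections.

```python
def zeta_symmetric(k, sigma_v, sigma_w):
    """Return (zeta_1, zeta_2) for two vertices with labels sigma_v, sigma_w
    in the k-block symmetric case, where P = I/k."""
    def proj_ones(s):
        # Projection of the indicator e_s onto the all-ones direction.
        return [1.0 / k] * k

    def proj_zero_sum(s):
        # Projection of e_s onto the vectors whose entries sum to 0.
        return [(1.0 if i == s else 0.0) - 1.0 / k for i in range(k)]

    def form(x, y):
        # x . P^{-1} y with P^{-1} = k * I.
        return k * sum(a * b for a, b in zip(x, y))

    z1 = form(proj_ones(sigma_v), proj_ones(sigma_w))
    z2 = form(proj_zero_sum(sigma_v), proj_zero_sum(sigma_w))
    return z1, z2


def separation(k, sigma_v, sigma_w):
    """zeta_i(v.v) - 2*zeta_i(v.v') + zeta_i(v'.v') for i = 1, 2."""
    vv = zeta_symmetric(k, sigma_v, sigma_v)
    vw = zeta_symmetric(k, sigma_v, sigma_w)
    ww = zeta_symmetric(k, sigma_w, sigma_w)
    return [vv[i] - 2.0 * vw[i] + ww[i] for i in range(2)]
```

Here $\zeta_2(v \cdot v') = k - 1$ when the labels agree and $-1$ otherwise, so the separation is 0 for vertices in the same community and 2k for vertices in different ones, while the $\zeta_1$ term carries no information.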

One could generate a reasonable classification based solely on this method of comparing 
vertices. However, that would require computing $N_{r,r'[E]}(v \cdot v)$ for every vertex in the graph 
with fairly large r + r', which would be slow. Instead, we use the fact that for any vertices 
v, v', and v'' with $\sigma_v = \sigma_{v'} \ne \sigma_{v''}$, 

$$\zeta_i(v' \cdot v') - 2\zeta_i(v \cdot v') + \zeta_i(v \cdot v) = 0 \le \zeta_i(v'' \cdot v'') - 2\zeta_i(v \cdot v'') + \zeta_i(v \cdot v)$$

for all i, and the inequality is strict for at least one i. So, subtracting $\zeta_i(v \cdot v)$ from both 
sides gives us that 

$$\zeta_i(v' \cdot v') - 2\zeta_i(v \cdot v') \le \zeta_i(v'' \cdot v'') - 2\zeta_i(v \cdot v'')$$

for all i, and the inequality is still strict for at least one i. 

So, given a representative vertex in each community, we can determine which of them a 
given vertex, v, is in the same community as without needing to know the value of $\zeta_i(v \cdot v)$. 

This runs fairly quickly if $\zeta_i(v \cdot v')$ is approximated using $N_{r,r'[E]}(v' \cdot v)$ such that r is 
large and r' is small because the algorithm only requires focusing on $|N_{r'}(v)|$ vertices. This 
leads to the following plan for partial recovery. First, randomly select a set of vertices that 
is large enough to contain at least one vertex from each community with high probability. 
Next, compare all of the selected vertices in an attempt to determine which of them are in 
the same communities. Then, pick one in each community. After that, use the algorithm 
referred to above to attempt to determine which community each of the remaining vertices 
is in. As long as there actually was at least one vertex from each community in the initial 
set and none of the approximations were particularly bad, this should give a reasonably 
accurate classification. 

The risk that this randomly gives a bad classification due to a bad set of initial vertices 
can be mitigated as follows. First, repeat the previous classification procedure several times. 
Assuming that the procedure gives a good classification more often than not, the good 
classifications should comprise a set that contains more than half the classifications and 
which has fairly little difference between any two elements of the set. Furthermore, any such 
set would have to contain at least one good classification, so none of its elements could be 
too bad. So, find such a set and average its classifications together. This completes the 
Agnostic-Sphere-comparison-algorithm. We refer to Section 6 for a detailed version. 

3.2 Exact recovery and the Agnostic-degree-profiling algorithm 

The exact recovery part is similar to [AS15] and uses the fact that once a good enough 
clustering has been obtained from Agnostic-sphere-comparison, the classification can be 
finished by making local improvements based on the nodes' neighborhoods. The key result 
here is that, when testing between two multivariate Poisson distributions of means $\log(n)\lambda_1$ 
and $\log(n)\lambda_2$ respectively, where $\lambda_1, \lambda_2 \in \mathbb{R}_+^k$, the probability of error (of say maximum a 
posteriori decoding) is 

$$\Theta\left(n^{-D_+(\lambda_1, \lambda_2) - o(1)}\right). \tag{9}$$

This is proved in [AS15]. In the case of unknown parameters, the algorithmic approach is 
largely unchanged, adding a step where the best known classification is used to estimate 
P and Q prior to any step in which vertices are classified based on their neighbors. The 
analysis of the algorithm requires however some careful handling. 

First, it is necessary to prove that given a labelling of the graph's vertices with an error 
rate of x, one can compute approximations of P and Q that are within $O(x + \log(n)/\sqrt{n})$ 
of their true values with probability $1 - o(1)$. Secondly, one needs to modify the robust 
degree profiling lemma to show that attempting to determine vertices' communities based 
on estimates of p and Q that are off by at most $\delta$, p' and Q', and a classification of its 
neighbors that has an error rate of $\delta$ classifies the vertices with an error rate only $e^{O(\delta \log n)}$ 
times higher than it would be given accurate values of p and Q and accurate classifications 
of the vertices' neighbors. Combining these yields the conclusion that any errors in the 
estimates of the SBM's parameters do not disrupt vertex classification any worse than the 
errors in the preliminary classifications already were. 

The Agnostic-degree-profiling algorithm. The inputs are $(G, \gamma)$, where $G$ is a
graph, and $\gamma \in [0,1]$ (see Theorem 7 for how to set $\gamma$).

The algorithm outputs an assignment of each vertex to one of the groups of communities
$\{A_1, \ldots, A_t\}$, where $A_1, \ldots, A_t$ is the partition of $[k]$ into the largest number of subsets
such that $D_+((PQ)_i, (PQ)_j) \geq 1$ for all $i, j$ in $[k]$ that are in different subsets (this is called
the "finest partition"). It does the following:

(1) Define the graph $g'$ on the vertex set $[n]$ by selecting each edge in $G$ independently
with probability $\gamma$, and define the graph $g''$ that contains the edges in $G$ that are not in $g'$.

(2) Run Agnostic-sphere-comparison on $g'$ to obtain the preliminary classification
$\sigma' \in [k]^n$ (see Section 7.1).

(3) Determine the size of each alleged community, and the edge density between each
pair of alleged communities.

(4) For each node $v \in [n]$, determine in which community node $v$ is most likely to belong
based on its degree profile computed from the preliminary classification $\sigma'$ (see Section
7.2.2), and call it $\sigma''_v$.

(5) Use $\sigma''$ to get new estimates of $p$ and $Q$.

(6) For each node $v \in [n]$, determine in which group $A_1, \ldots, A_t$ node $v$ is most likely to
belong based on its degree profile computed from the preliminary classification $\sigma''$ (see
Section 7.2.2).
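
Step (1) can be sketched as follows. This is only an illustration of the edge-splitting idea; the function name, the edge-list representation and the optional seed are assumptions made for the example.

```python
import random

def split_edges(edges, gamma, seed=None):
    """Split an edge list into g' (each edge kept with probability gamma) and g'' (the rest)."""
    rng = random.Random(seed)
    g_prime, g_double_prime = [], []
    for edge in edges:
        # Each edge is assigned independently of all the others.
        if rng.random() < gamma:
            g_prime.append(edge)
        else:
            g_double_prime.append(edge)
    return g_prime, g_double_prime
```

Because every edge lands in exactly one of the two graphs, the preliminary classification computed on $g'$ can later be refined using degree profiles measured on edges that were not used to produce it.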

4 An example with real data 

We have tested a simplified version of our algorithm on the data from “The political 
blogosphere and the 2004 US Election” [AG05], which contains a list of political blogs that 
were classified as liberal or conservative, and links between the blogs. 

The algorithm we used has a few major modifications relative to our standard algorithm.
First of all, instead of using $N_{r,r'}(v \cdot v')$ as its basic tool for comparing vertices, it uses a
different measure, $N'_{r,r'}(v \cdot v')$, which is defined as the fraction of pairs of an edge leaving
the ball of radius $r$ centered on $v$ and an edge leaving the ball of radius $r'$ centered on $v'$
which hit the same vertex but are not the same edge. Making the measure a fraction of the
pairs rather than a count of pairs was necessary to prevent $N'_{r,r'}(v \cdot v')$ from being massively
dependent on the degrees of $v$ and $v'$, which would have resulted in the increased variance
in vertex degree obscuring the effects of $\sigma_v$ and $\sigma_{v'}$ on $N'_{r,r'}(v \cdot v')$. The other changes to
the definition make the measure somewhat less reliable, but it is still usable as long as the
average degree is fairly high and $v \neq v'$.

Secondly, the version of Vertex-comparison-algorithm we used simply concludes that
two vertices, $v$ and $v'$, are in different communities if $N'_{r,r'}(v \cdot v')$ is below average and the
same community otherwise. This is reasonable because of the following facts. For one
thing, because the normalization converts the dominant term to a constant, $N'_{r,r'}(v \cdot v')$ is
approximately affine in $(\lambda_2/\lambda_1)^{r+r'} \zeta_2(v \cdot v')/n$. Also, as a result of the symmetry between
communities, $\zeta_2(v \cdot v)$ is the same for all $v$. So, $\zeta_2(v \cdot v) - 2\zeta_2(v \cdot v') + \zeta_2(v' \cdot v')$ is also
affine in $\zeta_2(v \cdot v')$. Furthermore, there are only two possible values of $\zeta_2(v \cdot v')$ by symmetry,
and $\lambda_2 > 0$, so $\zeta_2(v \cdot v) - 2\zeta_2(v \cdot v') + \zeta_2(v' \cdot v') > 0$ iff $(\lambda_2/\lambda_1)^{r+r'} \zeta_2(v \cdot v')/n$ is below
average. Finally, because both communities have the same average degree, $\zeta_1(v \cdot v')$ is
independent of $v$ and $v'$ so $\zeta_1(v \cdot v) - 2\zeta_1(v \cdot v') + \zeta_1(v' \cdot v')$ is always 0. The version of
Vertex-classification-algorithm we used is comparably modified.

Finally, our algorithm generates reference vertices by repeatedly picking two vertices at 
random and comparing them. If it concludes that they are in different communities and 
they both have above-average degree, it accepts them as reference vertices; otherwise it tries 
again. Requiring above-average degree is useful because a higher degree vertex is less likely 
to have its neighborhood distorted by a couple of atypical neighbors. 

Out of 40 trials, the resulting algorithm gave a reasonably good classification 37 times. 
Each of these classified all but 56 to 67 of the 1222 vertices in the graph’s main component 
correctly. The state-of-the-art described in [CG15] gives a lowest value at 58, with the best
algorithms around 60, while regularized spectral algorithms such as the one in
[QR13] obtain about 80 errors.

Figure 1: Visual representation of the clustering obtained on the Adamic and Glance '05
blog data.


5 Open problems 

The current result should also extend directly to a slowly growing number of communities
(e.g., up to logarithmic). It would be interesting to extend the current approach to smaller
communities or larger numbers of communities (watching how the complexity scales
with the number of communities), as well as to more general models with degree corrections,
labeled edges, or overlapping communities (though linear-sized overlapping communities can
be treated with the approach of [AS15]).

6 The Agnostic-sphere-comparison algorithm in detail

Recall the following motivation and definitions. 

Definition 8. For any vertex $v$, let $N_r(v)$ be the set of all vertices with shortest path to $v$ of
length $r$. We also refer to the vector with $i$-th entry equal to the number of vertices in $N_r(v)$
that are in community $i$ as $N_r(v)$. If there are multiple graphs that $v$ could be considered a
vertex in, let $N_{r[G]}(v)$ be the set of all vertices with shortest paths in $G$ to $v$ of length $r$.
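
As an illustration of Definition 8, the sets $N_0(v), N_1(v), \ldots$ can be obtained layer by layer with a breadth-first search. The sketch below assumes the graph is given as a dictionary mapping each vertex to an iterable of its neighbors; the function names are illustrative.

```python
def sphere_layers(adj, v, max_r):
    """Return [N_0(v), N_1(v), ..., N_max_r(v)] as a list of sets of vertices."""
    layers = [{v}]
    seen = {v}
    for _ in range(max_r):
        next_layer = set()
        for u in layers[-1]:
            for w in adj.get(u, ()):
                # A vertex belongs to the first layer that reaches it.
                if w not in seen:
                    seen.add(w)
                    next_layer.add(w)
        layers.append(next_layer)
    return layers

def community_vector(layer, labels, k):
    """Vector form of a layer: entry i counts the layer's vertices in community i."""
    counts = [0] * k
    for u in layer:
        counts[labels[u]] += 1
    return counts
```

The vector form requires the community labels, which is exactly the information the algorithm does not have. This is why the text goes on to compare spheres with each other rather than inspect them directly.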

One could probably determine $PQ$ and $e_{\sigma_v}$ given the values of $(PQ)^r e_{\sigma_v}$ for a few different
$r$, but using $N_r(v)$ to approximate that would require knowing how many of the vertices
in $N_r(v)$ are in each community. So, we attempt to get information relating to how many
vertices in $N_r(v)$ are in each community by checking how it connects to $N_{r'}(v')$ for some
vertex $v'$ and integer $r'$. The obvious way to do this would be to compute the cardinality of
their intersection. Unfortunately, whether a given vertex in community $i$ is in $N_r(v)$ is not
independent of whether it is in $N_{r'}(v')$, which causes $|N_r(v) \cap N_{r'}(v')|$ to
differ from what one would expect badly enough to disrupt plans to use it for approximations.

In order to get around this, we randomly and independently assign every edge in $G$ to a set $E$
with probability $c$. We hence define the following.

Definition 9. For any vertices $v, v' \in G$, $r, r' \in \mathbb{Z}$, and subset of $G$'s edges $E$, let
$N_{r,r'[E]}(v \cdot v')$ be the number of pairs of vertices $(v_1, v_2)$ such that $v_1 \in N_{r[G \backslash E]}(v)$,
$v_2 \in N_{r'[G \backslash E]}(v')$, and $(v_1, v_2) \in E$.

Note that $E$ and $G \backslash E$ are disjoint; however, $G$ is sparse enough that even if they were
generated independently a given pair of vertices would have an edge between them in
both with probability $O(1/n^2)$. So, $E$ is approximately independent of $G \backslash E$. Thus, for any
$v_1 \in N_{r[G \backslash E]}(v)$ and $v_2 \in N_{r'[G \backslash E]}(v')$, $(v_1, v_2) \in E$ with a probability of approximately
$cQ_{\sigma_{v_1},\sigma_{v_2}}/n$. As a result,

$$
\begin{aligned}
N_{r,r'[E]}(v \cdot v') &\approx N_{r[G \backslash E]}(v) \cdot \frac{cQ}{n} N_{r'[G \backslash E]}(v') \\
&\approx ((1-c)PQ)^r e_{\sigma_v} \cdot \frac{cQ}{n} ((1-c)PQ)^{r'} e_{\sigma_{v'}} \\
&= c(1-c)^{r+r'} e_{\sigma_v} \cdot Q(PQ)^{r+r'} e_{\sigma_{v'}}/n.
\end{aligned}
$$


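Let $\lambda_1, \ldots, \lambda_h$ be the distinct eigenvalues of $PQ$, ordered so that $|\lambda_1| \geq |\lambda_2| \geq \cdots \geq |\lambda_h| \geq 0$.
Also define $h'$ so that $h' = h$ if $\lambda_h \neq 0$ and $h' = h - 1$ if $\lambda_h = 0$. If $W_i$ is the eigenspace of
$PQ$ corresponding to the eigenvalue $\lambda_i$, and $P_{W_i}$ is the projection operator on to $W_i$, then

\begin{align}
N_{r,r'[E]}(v \cdot v') &\approx c(1-c)^{r+r'} e_{\sigma_v} \cdot Q(PQ)^{r+r'} e_{\sigma_{v'}}/n \tag{10} \\
&= \frac{c(1-c)^{r+r'}}{n} \Big(\sum_i P_{W_i}(e_{\sigma_v})\Big) \cdot Q(PQ)^{r+r'} \Big(\sum_j P_{W_j}(e_{\sigma_{v'}})\Big) \tag{11} \\
&= \frac{c(1-c)^{r+r'}}{n} \sum_{i,j} P_{W_i}(e_{\sigma_v}) \cdot Q(PQ)^{r+r'} P_{W_j}(e_{\sigma_{v'}}) \tag{12} \\
&= \frac{c(1-c)^{r+r'}}{n} \sum_i P_{W_i}(e_{\sigma_v}) \cdot Q(PQ)^{r+r'} P_{W_i}(e_{\sigma_{v'}}) \tag{13} \\
&= \frac{c(1-c)^{r+r'}}{n} \sum_i \lambda_i^{r+r'+1} P_{W_i}(e_{\sigma_v}) \cdot P^{-1} P_{W_i}(e_{\sigma_{v'}}) \tag{14}
\end{align}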


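where the final equality holds because for all $i \neq j$,

$$
\begin{aligned}
\lambda_i P_{W_i}(e_{\sigma_v}) \cdot P^{-1} P_{W_j}(e_{\sigma_{v'}}) &= (PQ P_{W_i}(e_{\sigma_v})) \cdot P^{-1} P_{W_j}(e_{\sigma_{v'}}) \\
&= P_{W_i}(e_{\sigma_v}) \cdot Q P_{W_j}(e_{\sigma_{v'}}) \\
&= P_{W_i}(e_{\sigma_v}) \cdot P^{-1} \lambda_j P_{W_j}(e_{\sigma_{v'}}),
\end{aligned}
$$

and since $\lambda_i \neq \lambda_j$, this implies that $P_{W_i}(e_{\sigma_v}) \cdot P^{-1} P_{W_j}(e_{\sigma_{v'}}) = 0$. In order to simplify the
terminology, we introduce the following notation.

Definition 10. Let $\zeta_i(v \cdot v') = P_{W_i}(e_{\sigma_v}) \cdot P^{-1} P_{W_i}(e_{\sigma_{v'}})$ for all $i$, $v$, and $v'$.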

Equation (14) is dominated by the $\lambda_1^{r+r'+1}$ term, so getting good estimates of the $\lambda_2^{r+r'+1}$
through $\lambda_{h'}^{r+r'+1}$ terms requires cancelling it out somehow. As a start, if $\lambda_1 > \lambda_2 > \lambda_3$ then

$$
\begin{aligned}
& N_{r+2,r'[E]}(v \cdot v') \cdot N_{r,r'[E]}(v \cdot v') - N^2_{r+1,r'[E]}(v \cdot v') \\
&\quad \approx \frac{c^2(1-c)^{2r+2r'+2}}{n^2} (\lambda_1^2 + \lambda_2^2 - 2\lambda_1\lambda_2) \lambda_1^{r+r'+1} \lambda_2^{r+r'+1} \zeta_1(v \cdot v') \zeta_2(v \cdot v').
\end{aligned}
$$

Note that the left hand side of this expression is equal to

$$
\det \begin{pmatrix} N_{r,r'[E]}(v \cdot v') & N_{r+1,r'[E]}(v \cdot v') \\ N_{r+1,r'[E]}(v \cdot v') & N_{r+2,r'[E]}(v \cdot v') \end{pmatrix}.
$$

More generally, in order to get an expression that can be used to estimate the $\lambda_i$ and $\zeta_i(v \cdot v')$,
we consider the determinant of the following.


Definition 11. Let $M_{m,r,r'[E]}(v \cdot v')$ be the $m \times m$ matrix such that $M_{m,r,r'[E]}(v \cdot v')_{i,j} =
N_{r+i+j,r'[E]}(v \cdot v')$ for each $i$ and $j$.
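
The cancellation that Definition 11 is designed to exploit can be checked on an idealized example. The sketch below is an illustration under simplifying assumptions, not part of the algorithm: it replaces $N_{r+t,r'[E]}(v \cdot v')$ by an exact sum of two geometric terms $a_1\lambda_1^t + a_2\lambda_2^t$ (no noise, no factor of $1-c$), and the helper names `det`, `hankel` and `seq` are hypothetical.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination over exact rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def hankel(seq, m, r):
    """The m x m matrix whose (i, j) entry is seq(r + i + j), as in Definition 11."""
    return [[seq(r + i + j) for j in range(m)] for i in range(m)]

# Idealized stand-in for N_{t,r'[E]}: two eigenvalues with fixed coefficients.
lam = [Fraction(5), Fraction(2)]
coef = [Fraction(3), Fraction(7)]

def seq(t):
    return sum(a * l ** t for a, l in zip(coef, lam))

d_r = det(hankel(seq, 2, 4))
d_r_plus_1 = det(hankel(seq, 2, 5))
ratio = d_r_plus_1 / d_r  # equals lam[0] * lam[1]
```

In this toy setting the $2 \times 2$ determinant equals $a_1 a_2 (\lambda_1 - \lambda_2)^2 (\lambda_1\lambda_2)^r$, so the ratio of consecutive determinants recovers $\lambda_1\lambda_2$ even though each individual entry is dominated by $\lambda_1$. The $3 \times 3$ determinant vanishes because only two eigenvalues are present, which is the signal used to count the distinct nonzero eigenvalues.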

To the degree that approximation 8 holds and $c$ is small, each column of $M_{m,r,r'[E]}(v \cdot v')$
is a linear combination of the vectors

$$
\frac{c(1-c)^{r+r'}}{n} \zeta_i(v \cdot v') \lambda_i^{r+r'+1} [1, \lambda_i, \lambda_i^2, \ldots, \lambda_i^{m-1}]^t
$$

with coefficients that depend only on $\{\lambda_1, \ldots, \lambda_h\}$. So, by linearity of the determinant in
one column, $\det(M_{m,r,r'[E]}(v \cdot v'))$ is a linear combination of these vectors' wedge products
with coefficients that are independent of $r$ and $r'$. By antisymmetry of wedge products,
only the products that use $m$ different such vectors contribute to the determinant, and the
products involving the eigenvalues of highest magnitude will dominate. As a result, there
exist constants $\gamma(\lambda_1, \ldots, \lambda_m)$ and $\gamma'(\lambda_1, \ldots, \lambda_m)$ such that

$$
\det(M_{m,r,r'[E]}(v \cdot v')) \approx \frac{c^m(1-c)^{m(r+r')}}{n^m} \gamma(\lambda_1, \ldots, \lambda_m) \prod_{i=1}^{m} \lambda_i^{r+r'+1} \zeta_i(v \cdot v')
$$

if $|\lambda_m| > |\lambda_{m+1}|$, and

$$
\begin{aligned}
\det(M_{m,r,r'[E]}(v \cdot v')) \approx{} & \frac{c^m(1-c)^{m(r+r')}}{n^m} \prod_{i=1}^{m-1} \lambda_i^{r+r'+1} \zeta_i(v \cdot v') \\
& \cdot \big(\gamma(\lambda_1, \ldots, \lambda_m) \lambda_m^{r+r'+1} \zeta_m(v \cdot v') + \gamma'(\lambda_1, \ldots, \lambda_m) \lambda_{m+1}^{r+r'+1} \zeta_{m+1}(v \cdot v')\big)
\end{aligned}
$$

if $|\lambda_m| = |\lambda_{m+1}|$. In the latter case, $\lambda_m = -\lambda_{m+1}$, so either $\gamma(\lambda_1, \ldots, \lambda_m)\lambda_m^{r+r'+1}$ and
$\gamma'(\lambda_1, \ldots, \lambda_m)\lambda_{m+1}^{r+r'+1}$ have the same sign or $\gamma(\lambda_1, \ldots, \lambda_m)\lambda_m^{r+r'+2}$ and $\gamma'(\lambda_1, \ldots, \lambda_m)\lambda_{m+1}^{r+r'+2}$
have the same sign. In any of these cases,

$$
\max(|\det M_{m,r,r'[E]}(v \cdot v)|, |\det M_{m,r+1,r'[E]}(v \cdot v)|) \asymp \frac{c^m(1-c)^{m(r+r')}}{n^m} \prod_{i=1}^{m} |\lambda_i|^{r+r'+1}.
$$

This suggests the following algorithm for finding $PQ$'s eigenvalues.


The Basic-Eigenvalue-approximation-algorithm. The inputs are $(E, c, v)$, where $v$ is
a vertex of the graph, $c \in (0,1)$, and $E$ is a subset of $G$'s edges.

The algorithm outputs a claim about how many distinct nonzero eigenvalues $PQ$ has and
a list of approximations of them.

(1) Compute $N_{r[G \backslash E]}(v)$ for each $r$ until $|N_{r[G \backslash E]}(v)| > \sqrt{n}$, and then set $\lambda''_1 = \sqrt[2r]{n}/(1-c)$.

(2) Set r = r' = | logn/log((l — c)A") — \An n. Then, compute 


n max(| det M m .,r,r[E] (v • ^)|, 1 det M m ^ r+iyE] {v • ^)1) 
cmax(| detM m _ l r)r[E] (u • u)|, | det M m _ l r+iyE] (v • u)|) 


until an m is found for which this expression is less than ((1 — c)A / 1 / ) 3 / 4 + , 1 
h" = m — 1 . 

(3) Then, set

$$
|\lambda'_i| = \frac{1}{1-c} \sqrt{\frac{\det(M_{i,r+3,r'[E]}(v \cdot v'))}{\det(M_{i,r+1,r'[E]}(v \cdot v'))}} \Big/ \prod_{j=1}^{i-1} (1-c)|\lambda'_j|
$$

unless $|\det(M_{i,r+1,r'[E]}(v \cdot v'))| < \sqrt{|\det(M_{i,r,r'[E]}(v \cdot v'))| \cdot |\det(M_{i,r+2,r'[E]}(v \cdot v'))|}$, in which
case set

$$
|\lambda'_i| = \frac{1}{1-c} \sqrt{\frac{\det(M_{i,r+2,r'[E]}(v \cdot v'))}{\det(M_{i,r,r'[E]}(v \cdot v'))}} \Big/ \prod_{j=1}^{i-1} (1-c)|\lambda'_j|.
$$

Repeat this for each $i \leq h''$.

(4) Next, for each i < h" , if ||A'| - |A' +1 || < then set A- = |A'| and A' +1 = —|A- +1 |. 

For each i < h" such that ||A'| - |A' +1 || > ^ and 11A'_ x | - |A'|| > ^ set 

1 T~r 

A i = —— det (M itT+iy[E] (v ■ v'))/ det {M yy[E] (v ■ v'))/ A'- 

C 3=1 

(5) Return (A' l5 ..., \' h „) 


The risk when using this algorithm is that if the set of edges in $v$'s immediate neighborhood
is sufficiently atypical it may not work correctly. This can be solved by repeating it for
several vertices and taking the median estimates.

The Improved-Eigenvalue-approximation-algorithm. The input is $c \in (0,1)$.

The algorithm outputs a claim about how many distinct nonzero eigenvalues $PQ$ has and
a list of approximations of them.

(1) Create a set of edges $E$, that each of $G$'s edges is independently assigned to with
probability $c$.

(2) Randomly select $\sqrt{\ln n}$ of $G$'s vertices, $v[1], v[2], \ldots, v[\sqrt{\ln n}]$.

(3) Run Basic-Eigenvalue-approximation-algorithm$(E, c, v[i])$ for each $i \leq \sqrt{\ln n}$, stopping
the algorithm prematurely if it takes more than $O(n\sqrt{\ln n})$ time.

(4) Return $(\lambda'_1, \ldots, \lambda'_{h''})$ where $h''$ and $\lambda'_i$ are the median outputs of the executions of
Basic-Eigenvalue-approximation-algorithm for each $i$.
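
The median step (4) can be sketched as follows. The paper does not spell out how runs that disagree on the number of eigenvalues are handled, so the choice below (take the median count, then take medians over the runs that report exactly that count) is an assumption made for illustration, as is the function name.

```python
from statistics import median, median_low

def combine_eigenvalue_runs(runs):
    """Combine the outputs of several Basic-Eigenvalue-approximation runs.

    `runs` is a list of lists; each inner list holds one run's eigenvalue estimates.
    Returns the median eigenvalue count and the coordinate-wise median estimates.
    """
    # median_low always returns an actual data point, so the count stays an integer.
    h = median_low([len(run) for run in runs])
    # Assumption: only runs reporting exactly h eigenvalues contribute to the medians.
    agreeing = [run for run in runs if len(run) == h]
    estimates = [median([run[i] for run in agreeing]) for i in range(h)]
    return h, estimates
```

Taking medians means that a minority of runs started from vertices with atypical neighborhoods cannot move the final estimates far.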


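Now, note that whether or not $|\lambda_m| = |\lambda_{m+1}|$, we have

$$
\begin{aligned}
& \det(M_{m,r+1,r'[E]}(v \cdot v')) - (1-c)^m \lambda_{m+1} \prod_{i=1}^{m-1} \lambda_i \cdot \det(M_{m,r,r'[E]}(v \cdot v')) \\
&\quad \approx \frac{c^m}{n^m} \gamma(\lambda_1, \ldots, \lambda_m) \frac{\lambda_m - \lambda_{m+1}}{\lambda_m} \prod_{i=1}^{m} (1-c)^{r+r'+1} \lambda_i^{r+r'+2} \zeta_i(v \cdot v').
\end{aligned}
$$

That means that

$$
\begin{aligned}
& \frac{\det(M_{m,r+1,r'[E]}(v \cdot v')) - (1-c)^m \lambda_{m+1} \prod_{i=1}^{m-1} \lambda_i \det(M_{m,r,r'[E]}(v \cdot v'))}{\det(M_{m-1,r+1,r'[E]}(v \cdot v')) - (1-c)^{m-1} \lambda_m \prod_{i=1}^{m-2} \lambda_i \det(M_{m-1,r,r'[E]}(v \cdot v'))} \\
&\quad \approx \frac{c\, \gamma(\lambda_1, \ldots, \lambda_m)\, \lambda_{m-1}(\lambda_m - \lambda_{m+1})}{(1-c)\, n\, \gamma(\lambda_1, \ldots, \lambda_{m-1})\, \lambda_m(\lambda_{m-1} - \lambda_m)} ((1-c)\lambda_m)^{r+r'+2} \zeta_m(v \cdot v').
\end{aligned}
$$

This fact can be used in combination with estimates of $PQ$'s eigenvalues to approximate
$\zeta_i(v \cdot v')$ for arbitrary $v$, $v'$, and $i$ as follows.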

The Vertex-product-approximation-algorithm. The inputs are
$(v, v', r, r', E, c, (\lambda'_1, \ldots, \lambda'_{h''}))$, where $v, v'$ are vertices, $r, r'$ are positive integers, $E$ is a subset
of $G$'s edges, $c \in (0,1)$, and $\lambda'_i \in \mathbb{R}$ for all $i$. It is assumed that $N_{r''[G \backslash E]}(v)$ has already been
computed for $r'' \leq r + 2h'' + 3$ and that $N_{r''[G \backslash E]}(v')$ has already been computed for $r'' \leq r'$.

The algorithm outputs $(z_1(v \cdot v'), \ldots, z_{h''}(v \cdot v'))$ such that $z_i(v \cdot v') \approx \zeta_i(v \cdot v')$ for all $i$.

(1) For each i < h", set 


Zi(v ■ v') 


det {Mi, r+ iy[ E ](v ■ v ') - (1 
det(Mj_ l r+l r / [B] (v ■ v') - (1 
n(A'_! - A')7({(1 - c)A'-},*- 
'cAUlA'-A^haa-c)^}, 


- c YK+i n}=i A'- det (M i}r y [E] (v ■ v') 

~ c) i_1 A- Il}= 2 i A !j det(M i _ 1|r>r / [B] (u • v') 

A|((l - c)Ap- r - ’•'- 1 


(2) Return $(z_1(v \cdot v'), \ldots, z_{h''}(v \cdot v'))$.


Of course, this requires $r$ and $r'$ to be large enough that

$$
\frac{c(1-c)^{r+r'}}{n} \lambda_i^{r+r'+1} \zeta_i(v \cdot v')
$$

is large relative to the error terms for all $i \leq h'$. At a minimum, that requires that
$|(1-c)\lambda_i|^{r+r'+1} = \omega(n)$, so

$$
r + r' > \log(n)/\log((1-c)|\lambda_{h'}|).
$$

On the flip side, one also needs

$$
r, r' < \log(n)/\log((1-c)\lambda_1)
$$

because otherwise the graph will start running out of vertices before one gets $r$ steps away
from $v$ or $r'$ steps away from $v'$.

Furthermore, for any $v$ and $v'$,

$$
\begin{aligned}
0 &\leq P_{W_i}(e_{\sigma_v} - e_{\sigma_{v'}}) \cdot P^{-1} P_{W_i}(e_{\sigma_v} - e_{\sigma_{v'}}) \\
&= \zeta_i(v \cdot v) - 2\zeta_i(v \cdot v') + \zeta_i(v' \cdot v')
\end{aligned}
$$

with equality for all $i$ if and only if $\sigma_v = \sigma_{v'}$, so sufficiently good approximations of
$\zeta_i(v \cdot v)$, $\zeta_i(v \cdot v')$ and $\zeta_i(v' \cdot v')$ can be used to determine which pairs of vertices are in the
same community as follows.

The Vertex-comparison-algorithm. The inputs are $(v, v', r, r', E, x, c, (\lambda'_1, \ldots, \lambda'_{h''}))$, where
$v, v'$ are two vertices, $r, r'$ are positive integers, $E$ is a subset of $G$'s edges, $x$ is a positive
real number, $c$ is a real number between 0 and 1, and $(\lambda'_1, \ldots, \lambda'_{h''})$ are real numbers.

The algorithm outputs a decision on whether $v$ and $v'$ are in the same community or
not. It proceeds as follows.

(1) Run Vertex-product-approximation-algorithm$(v, v', r, r', E, c, (\lambda'_1, \ldots, \lambda'_{h''}))$,
Vertex-product-approximation-algorithm$(v, v, r, r', E, c, (\lambda'_1, \ldots, \lambda'_{h''}))$, and
Vertex-product-approximation-algorithm$(v', v', r, r', E, c, (\lambda'_1, \ldots, \lambda'_{h''}))$.

(2) If $\exists i : z_i(v \cdot v) - 2z_i(v \cdot v') + z_i(v' \cdot v') > 5(2x(\min p_j)^{-1/2} + x^2)$ then conclude that
$v$ and $v'$ are in different communities. Otherwise, conclude that $v$ and $v'$ are in the same
community.
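
The decision rule in step (2) can be sketched as below, assuming the three lists of approximations have already been produced by the Vertex-product-approximation-algorithm. The function and argument names are illustrative, and `p_min` stands for $\min_j p_j$, which the agnostic algorithm would replace by its lower bound.

```python
def in_different_communities(z_vv, z_vw, z_ww, x, p_min):
    """Apply the threshold test of the Vertex-comparison-algorithm.

    z_vv[i], z_vw[i], z_ww[i] approximate zeta_i(v.v), zeta_i(v.v'), zeta_i(v'.v').
    """
    threshold = 5 * (2 * x * p_min ** -0.5 + x ** 2)
    # One index i exceeding the threshold is enough to separate the two vertices.
    return any(a - 2 * b + c > threshold for a, b, c in zip(z_vv, z_vw, z_ww))
```

The quantity being thresholded is an approximation of $P_{W_i}(e_{\sigma_v} - e_{\sigma_{v'}}) \cdot P^{-1} P_{W_i}(e_{\sigma_v} - e_{\sigma_{v'}})$, which is zero for every $i$ exactly when the two vertices share a community.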


One could generate a reasonable classification based solely on this method of comparing
vertices (with an appropriate choice of the parameters, as later detailed). However, that
would require computing $N_{r,r'[E]}(v \cdot v)$ for every vertex in the graph with fairly large $r + r'$,
which would be slow. Instead, we use the fact that for any vertices $v$, $v'$, and $v''$ with
$\sigma_v = \sigma_{v'} \neq \sigma_{v''}$,

$$
\begin{aligned}
\zeta_i(v' \cdot v') - 2\zeta_i(v \cdot v') + \zeta_i(v \cdot v) &= 0 \\
&\leq \zeta_i(v'' \cdot v'') - 2\zeta_i(v \cdot v'') + \zeta_i(v \cdot v)
\end{aligned}
$$

for all $i$, and the inequality is strict for at least one $i$. So, subtracting $\zeta_i(v \cdot v)$ from both
sides gives us that

$$
\zeta_i(v' \cdot v') - 2\zeta_i(v \cdot v') \leq \zeta_i(v'' \cdot v'') - 2\zeta_i(v \cdot v'')
$$

for all $i$, and the inequality is still strict for at least one $i$.

So, given a representative vertex in each community, we can determine which of them a
given vertex, $v$, is in the same community as without needing to know the value of $\zeta_i(v \cdot v)$
as follows.

The Vertex-classification-algorithm. The inputs are $(v[], v', r, r', E, c, (\lambda'_1, \ldots, \lambda'_{h''}))$, where
$v[]$ is a list of vertices, $v'$ is a vertex, $r, r'$ are positive integers, $E$ is a subset of $G$'s edges,
$c$ is a real number between 0 and 1, and $(\lambda'_1, \ldots, \lambda'_{h''})$ are real numbers. It is assumed that
$z_i(v[\sigma] \cdot v[\sigma])$ has already been computed for each $i$ and $\sigma$.

The algorithm is supposed to output $\sigma$ such that $v'$ is in the same community as $v[\sigma]$. It
works as follows.

(1) Run Vertex-product-approximation-algorithm$(v[\sigma], v', r, r', E, c, (\lambda'_1, \ldots, \lambda'_{h''}))$ for each $\sigma$.

(2) Find a $\sigma$ that minimizes the value of

$$
\max_{\sigma' \neq \sigma,\, i \leq h''} z_i(v[\sigma] \cdot v[\sigma]) - 2z_i(v[\sigma] \cdot v') - \big(z_i(v[\sigma'] \cdot v[\sigma']) - 2z_i(v[\sigma'] \cdot v')\big)
$$

and conclude that $v'$ is in the same community as $v[\sigma]$.
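
Step (2) amounts to an arg-min over candidate communities of a worst-case gap. A minimal sketch, assuming at least two reference vertices and assuming the approximations $z_i$ have already been computed and stored as nested lists (the names are illustrative):

```python
def classify_vertex(z_self, z_cross):
    """Pick the reference community for v' by the min-max rule.

    z_self[s][i]  approximates zeta_i(v[s] . v[s]) for reference vertex v[s].
    z_cross[s][i] approximates zeta_i(v[s] . v').
    Requires at least two reference vertices.
    """
    k = len(z_self)
    # score[s][i] = z_i(v[s].v[s]) - 2 z_i(v[s].v'); small means "close to v[s]".
    score = [[a - 2 * b for a, b in zip(z_self[s], z_cross[s])] for s in range(k)]

    def worst_gap(s):
        return max(
            score[s][i] - score[t][i]
            for t in range(k) if t != s
            for i in range(len(score[s]))
        )

    return min(range(k), key=worst_gap)
```

For the correct $\sigma$ every gap is at most zero up to approximation error, whereas for a wrong $\sigma$ at least one gap is strictly positive, which is why minimizing the maximum gap identifies the community.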


This runs fairly quickly if $r$ is large and $r'$ is small because the algorithm only requires
focusing on $|N_{r'}(v')|$ vertices. This leads to the following plan for partial recovery. First,
randomly select a set of vertices that is large enough to contain at least one vertex from each 
community with high probability. Next, compare all of the selected vertices in an attempt to 
determine which of them are in the same communities. Then, pick one anchor vertex in each 
community. After that, use the algorithm above to attempt to determine which community 
each of the remaining vertices is in. As long as there actually was at least one vertex from 
each community in the initial set and none of the approximations were particularly bad, this 
should give a reasonably accurate classification. 
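
How large the initial set needs to be can be estimated with a union bound. The sketch below is illustrative only: it assumes vertices are sampled uniformly with replacement, that every community has relative size at least $\delta$ (so there are at most $\lfloor 1/\delta \rfloor$ communities), and it rounds the sample size $\ln(4\lfloor 1/\delta \rfloor)/\delta$ used later in this section up to an integer. The function names are hypothetical.

```python
import math

def anchor_sample_size(delta):
    """Sample size ln(4 * floor(1/delta)) / delta, rounded up (rounding is an assumption)."""
    k_max = math.floor(1 / delta)
    return math.ceil(math.log(4 * k_max) / delta)

def miss_probability_bound(delta, m):
    """Union bound on the probability that m samples miss at least one community."""
    k_max = math.floor(1 / delta)
    # A fixed community of relative size >= delta is missed with probability <= (1 - delta)^m.
    return k_max * (1 - delta) ** m
```

With this choice the bound is at most $1/4$, so a single run hits every community most of the time, which is what the repetition and majority argument for the reliable version of the algorithm needs.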

The Unreliable-graph-classification-algorithm. The inputs are $(G, c, m, \epsilon, x, (\lambda'_1, \ldots, \lambda'_{h''}))$,
where $G$ is a graph, $c$ is a real number between 0 and 1, $m$ is a positive integer, $\epsilon$ is a real
number between 0 and 1, $x$ is a positive real number, and $(\lambda'_1, \ldots, \lambda'_{h''})$ are real numbers.

The algorithm outputs an alleged list of communities for $G$. It works as follows.

(1) Randomly assign each edge in $G$ to $E$ independently with probability $c$.

(2) Randomly select $m$ vertices in $G$, $v[0], \ldots, v[m-1]$.

(3) Set $r = (1 - \frac{\epsilon}{3}) \log n/\log((1-c)\lambda'_1) - \sqrt{\ln n}$ and $r' = \frac{2\epsilon}{3} \cdot \log n/\log((1-c)\lambda'_1)$.

(4) Compute $N_{r''[G \backslash E]}(v[i])$ for each $r'' \leq r + 2h'' + 3$ and $0 \leq i < m$.

(5) Run Vertex-comparison-algorithm$(v[i], v[j], r, r', E, x, c, (\lambda'_1, \ldots, \lambda'_{h''}))$ for every $i$ and $j$.

(6) If these give consistent results, randomly select one alleged member of each community
$v'[\sigma]$. Otherwise, fail.

(7) For every $v''$ in the graph, compute $N_{r''[G \backslash E]}(v'')$ for each $r'' \leq r'$. Then, run
Vertex-classification-algorithm$(v'[], v'', r, r', E, c, (\lambda'_1, \ldots, \lambda'_{h''}))$ in order to get a hypothesized
classification of $v''$.

(8) Return the resulting classification.

The risk that this randomly gives a bad classification due to a bad set of initial vertices 
can be mitigated as follows. First, repeat the previous classification procedure several times. 
Assuming that the procedure gives a good classification more often than not, the good 
classifications should comprise a set that contains more than half the classifications and 
which has fairly little difference between any two elements of the set. Furthermore, any such 
set would have to contain at least one good classification, so none of its elements could be 
too bad. So, find such a set and average its classifications together. 

So, the overall Agnostic-sphere-comparison-algorithm starts by estimating $PQ$'s
eigenvalues. Then, it uses those estimates to pick appropriate values of $x$ and $\epsilon$ for the
Unreliable-graph-classification-algorithm. Finally, it runs it several times and combines the
resulting classifications as explained above. The only inputs it requires are the graph itself
and some $\delta > 0$ such that $p_i > \delta$ for all $i$.


The Reliable-graph-classification-algorithm (i.e., Agnostic-sphere-comparison). The
inputs are $(G, m, \delta, T(n))$, where $G$ is a graph, $m$ is a positive integer, $\delta$ is a positive real
number, and $T$ is a function from the positive integers to itself.

The algorithm outputs an alleged list of communities for G. It works as follows. 

(1) Run Improved-Eigenvalue-approximation-algorithm$(0.1)$ in order to compute $(\lambda'_1, \ldots, \lambda'_{h''})$.

(2) Let $\lambda''_1 = \lambda'_1 + 2\ln^{-3/2}(n)$, $\lambda''_{h''} = \lambda'_{h''} - 2\ln^{-3/2}(n)$, and $k' = \lfloor 1/\delta \rfloor$.

(3) Let x be the smallest rational number of minimal numerator such that 






X x h" ) 


l _ m _|_ m . 2k'e 16A l(fc , ) 3/2 (TO“ 1/2 +z) / 1 _ g 16A' , (fe') 3 / 2 ((deita)- 1 /2 +a: ) U 4(A") 


-)-l) 



(4) Let e be the smallest rational number of the form - or 1 —\ such that (2(A / 1 / ) 3 /(A/,,) 2 ) 1 e / 3 < 
A" and (1 + e/3) > log(A") / log ((A'/„) 2 / 2 A'/) 

(5) Let c be the largest unit reciprocal less than 1/9 such that all of the following hold: 

(l-c)(A'/„) 4 >4(A") 3 
(2(1 — c )(Ai) 3 /(A//) 2 ) 1_e/3 < (1 — c)X'{ 

(1 + e/3) > log((l - c)A?)/log((l - c)(A'//) 2 /2A / 1 / ) 

P(l-c)(\'fi) 2 S / P(l~c )(\"„) 2 a ..P-cHA",,) 4 

_ S) m + m ■ 2k' e 16A i' (fe , ) 3/2 ((<5) _1/2 +^) j M _ e i6A' 1 '(fc') 3 / 2 ((«)- 1 /2 +a! ) 4 (a") 3 ’ ’ 

(6) Run Unreliable-graph-classification-algorithm$(G, c, m, \epsilon, x, (\lambda'_1, \ldots, \lambda'_{h''}))$ $T(n)$ times
and record the resulting classifications.

(7) Find the smallest $y''$ such that there exists a set of more than half of the classifications
no two of which have more than $y''$ disagreement, and discard all classifications not in the
set. In this step, define the disagreement between two classifications as the minimum
disagreement over all bijections between their communities.

(8) For every vertex in $G$, randomly pick one of the remaining classifications and assert
that it is in the community claimed by that classification, where a community from one
classification is assumed to correspond to the community it has the greatest overlap with in
each other classification.

(9) Return the resulting combined classification.
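
The disagreement measure used in step (7) can be sketched by brute force over all relabelings. This is an illustration only: it enumerates all $k!$ bijections, which is reasonable only for a small number of communities, and the function name is an assumption.

```python
from itertools import permutations

def disagreement(labels_a, labels_b, k):
    """Minimum number of vertices on which two classifications differ,
    taken over all bijections between their k community labels."""
    best = len(labels_a)
    for perm in permutations(range(k)):
        # perm maps community labels of the first classification to the second.
        mismatches = sum(1 for a, b in zip(labels_a, labels_b) if perm[a] != b)
        best = min(best, mismatches)
    return best
```

Two classifications that describe the same partition under different community names therefore have disagreement zero, which is what allows independent runs to be compared.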

If the conditions of Theorem 2 are satisfied, then there exists $\delta$ such that


Reliable-graph-classification-algorithm$(G, \ln(4\lfloor 1/\delta \rfloor)/\delta, \delta, \ln n)$


classifies at least 


1 - 


6 ke 


Cp 2 

4fc 




(15) 


of $G$'s vertices correctly with probability $1 - o(1)$ and it runs in $O(n^{1+\epsilon})$ time, for appropriate
C and p = \J A 2 ,/4Ai. 


7 Appendix 


7.1 Partial Recovery 

7.1.1 Formal results 


Theorem 6. For any $\delta > 0$, there exists an algorithm (Agnostic-sphere-comparison)
such that the following holds. Given any $k \in \mathbb{Z}$, $p \in (0,1)^k$ with $|p| = 1$, and symmetric
matrix $Q$ with no two rows equal, let $\lambda$ be the largest eigenvalue of $PQ$, and $\lambda'$ be the
eigenvalue of PQ with the smallest nonzero magnitude. For any x, x’, and e such that x is 
either a unit reciprocal or an integer, e is a rational number of the form ( or 1 — and all 
of the following hold: 


n 2 \t2 

.yx A min 


n 2\/2 
.yx A minp^ 


__ (7 • 9A , ) ) 1 ) 

2ke 1 6 Afc3 / 2 (( min Pi) _1 / 2 + a: ) / | 1 — g 16Afc 3 / 2 ((minp i )- 1 / 2 +r ! :) U 4A 3 ' 


1 

<2 


.9(A'72 ) 4 > A' 
0 < x < x' < 


A k 


X' min pj 
(2A 3 /A ,2 ) 1 - £ / 3 < A 
(1 + e/3) > log(A)/ log(A /2 /2A) 

13(2x / (minp j ) _1/2 + 7') 2 ) < min(u;i({u}) - Wi({v'})) ■ P -1 ^^}) - Wi({v'})) 
Every entry of Q k is positive 

3w e R k such that QPw = Xw, w ■ Pw = 1, and x < mmwi/2. 

5 < min 

2 /2 / 2 t2 /4 \ 

8 In(4 [1 /S\) [1 /8\ e~ i6ali/sj 3 / 2 (5-i/ 2 + x) / ( 1 - e " i6xyi/sp/ 2 ^/ 2 + x ) ' (( ^ )_1) ] < S 


n / 2,/2 
.yx A minp^ 


_ g_ * ^ 111111 yj _// .9A \ 

min Pi > 8ke 16xk3 ^ 2 (( min Pi)~ 1 ^ 2 + x ') / I 1 — e 16 Afc 3 / 2 ((mmp i )- 1 / 2 +:I: , ) 4a3 
With probability $1 - o(1)$, the algorithm runs in $O(n^{1+\frac{2}{3}\epsilon} \log n)$ time and detects
communities in graphs drawn from $\mathbb{G}_1(n,p,Q)$ with accuracy at least $1 - 3y'$ without any input
beyond $\delta$ and the graph, where

_ -9x ,2 X' 2 min Pi / _ _ x' 2 X 12 min Pi _/, ,9A /4 \ j -A 

y' — 2ke 16Afc 3 /^((minp^)~ 1 /^+3: / ) j | ^ _ g ' 16Afc 3 / 2 ((min Pi ) ~ 1 / 2 +x') ^ 4A3 ' ‘ J 


Considering the way $\delta$, $\epsilon$, $x$, and $x'$ scale when $Q$ is multiplied by a scalar yields the
following corollary.

Corollary 2. For any $k \in \mathbb{Z}$, $p \in (0,1)^k$ with $|p| = 1$, and symmetric matrix $Q$ with no two
rows equal such that $Q^k$ has all positive entries, there exists $\epsilon(c) = O(1/\ln(c))$ such that for
all sufficiently large $c$, Agnostic-sphere-comparison detects communities in graphs drawn
from $\mathbb{G}_1(n,p,cQ)$ with accuracy at least $1 - e^{-\Omega(c)}$ in $O_n(n^{1+\epsilon(c)})$ time.

If instead of having constant average degree, one has an average degree which increases
as $n$ increases, one can slowly reduce b, $\delta$, and $\epsilon$ as $n$ increases, leading to the following
corollary.

Corollary 3. For any $k \in \mathbb{Z}$, $p \in [0,1]^k$ with $|p| = 1$, symmetric matrix $Q$ with no two
rows equal such that $Q^m$ has all positive entries for sufficiently large $m$, and $c(n)$ such that
$c = \omega(1)$, Agnostic-sphere-comparison detects the communities with accuracy $1 - o(1)$ in
$\mathbb{G}_1(n, p, c(n)Q)$ and runs in $o(n^{1+\epsilon})$ time for all $\epsilon > 0$.

These corollaries are important as they show that if the entries of the connectivity
matrix $Q$ are amplified by a coefficient growing with $n$, almost exact recovery is achieved by
Agnostic-sphere-comparison without parameter knowledge.

7.1.2 Proof of Theorem 6 

Proving Theorem 6 will require establishing some terminology. First, let $\lambda_1, \ldots, \lambda_h$ be the
distinct eigenvalues of $PQ$, ordered so that $|\lambda_1| \geq |\lambda_2| \geq \cdots \geq |\lambda_h| \geq 0$ and if $|\lambda_i| = |\lambda_{i+1}|$
then $\lambda_i > 0 > \lambda_{i+1}$. Also define $h'$ so that $h' = h$ if $\lambda_h \neq 0$ and $h' = h - 1$ if $\lambda_h = 0$. In
addition to this, let $d$ be the largest sum of a column of $PQ$.

Definition 12. For any graph $G$ drawn from $\mathbb{G}_1(n,p,Q)$ and any set of vertices in $G$, $V$,
let $\vec{V}$ be the vector such that $\vec{V}_i$ is the number of vertices in $V$ that are in community $i$.
Define $w_1(V), w_2(V), \ldots, w_h(V)$ such that $\vec{V} = \sum_i w_i(V)$ and $w_i(V)$ is an eigenvector of
$PQ$ with eigenvalue $\lambda_i$ for each $i$.

$w_1(V), \ldots, w_h(V)$ are well defined because $\mathbb{R}^k$ is the direct sum of $PQ$'s eigenspaces. The
key intuition behind their importance is that if $V'$ is the set of vertices adjacent to vertices
in $V$ then $\vec{V'} \approx PQ\vec{V}$, so $w_i(V') \approx PQ \cdot w_i(V) = \lambda_i w_i(V)$.

Definition 13. For any vertex $v$, let $N_r(v)$ be the set of all vertices with shortest path to $v$
of length $r$. If there are multiple graphs that $v$ could be considered a vertex in, let $N_{r[G']}(v)$
be the set of all vertices with shortest paths in $G'$ to $v$ of length $r$.

We also typically refer to $\vec{N}_{r[G']}(v)$ as simply $N_{r[G']}(v)$, as the context will make it clear
whether the expression refers to a set or vector.

Definition 14. A vertex $v$ of a graph drawn from $\mathbb{G}_1(n,p,Q)$ is $(R, x)$-good if for all
$0 \leq r < R$ and $w \in \mathbb{R}^k$ with $w \cdot Pw = 1$

I W ■ N r+ i(v) - w ■ PQN r (v)\ < 

and ( R,x)-bad otherwise. 

Note that since any such w can be written as a linear combination of the e*, v is 
(R, x)-good if | e% ■ N r+ i(v) — ei ■ PQN r (v)\ < yJPi/k f° r all 1 < i < k and 

0 < r < R. 

Lemma 1. If $v$ is a $(R, x)$-good vertex of a graph drawn from $\mathbb{G}_1(n,p,Q)$, then for every
$0 \leq r \leq R$, $|N_r(v)| \leq \lambda_1^r \sqrt{k}((\min p_i)^{-1/2} + x)$.

Proof. First, note that for any eigenvector of PQ, w, and r < R, 

| • N r+1 (v) - (P-\v) • PQN r (v) | < ^ (j&\ Vw ■ P-'w 

So, by the triangle inequality, 

| (P~ 1 w) • iV r+ i(u)| < \(P~ l PQw) ■ N r (v)\ + Vu; • P^w 

/\ \r+l 

< Ai|(P _1 u;) • iV r (?;)| + x f ~y J Viv ■ P~ l w 
Thus, for any r < R, it must be the case that 

r / \ \ r ' 

|(-P _1 it;) • iV r (f)| < Ai|(P _1 w;) • iVo(u)| + y A' 1 _r • x ( -y j Vw ■ P~ l w 
< K [\Wo v /Pa v \ + xVw • P-%) 

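Now, define $w_1, \ldots, w_h$ such that $PQw_i = \lambda_i w_i$ for each $i$ and $p = \sum_{i=1}^h w_i$. For any $i, j$,

$$
\begin{aligned}
\lambda_i w_i \cdot P^{-1} w_j &= (PQw_i) \cdot P^{-1} w_j \\
&= w_i \cdot P^{-1} PQ w_j \\
&= \lambda_j w_i \cdot P^{-1} w_j.
\end{aligned}
$$

If $i \neq j$, then $\lambda_i \neq \lambda_j$, so this implies that $w_i \cdot P^{-1} w_j = 0$. It follows from this that

$$
\begin{aligned}
\sum_i w_i \cdot P^{-1} w_i &= \sum_{i,j} w_i \cdot P^{-1} w_j \\
&= \Big(\sum_i w_i\Big) \cdot P^{-1} \Big(\sum_j w_j\Big) \\
&= p \cdot P^{-1} p = 1.
\end{aligned}
$$

Also, for any $i$, it is the case that

$$
|(w_i)_{\sigma_v}/p_{\sigma_v}| \leq \sqrt{(w_i)_{\sigma_v} \cdot p_{\sigma_v}^{-1} \cdot (w_i)_{\sigma_v}} \Big/ \sqrt{p_{\sigma_v}} \leq (\min p_j)^{-1/2} \sqrt{w_i \cdot P^{-1} w_i}.
$$

Therefore, for any $r \leq R$, we have that

$$
\begin{aligned}
|N_r(v)| &= |(P^{-1}p) \cdot N_r(v)| \\
&\leq \sum_i |(P^{-1}w_i) \cdot N_r(v)| \\
&\leq \lambda_1^r \sum_i |(w_i)_{\sigma_v}/p_{\sigma_v}| + x\lambda_1^r \sum_i \sqrt{w_i \cdot P^{-1} w_i} \\
&\leq \lambda_1^r \sqrt{k}((\min p_i)^{-1/2} + x).
\end{aligned}
$$

□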


The following two lemmas are proved in [AS15].

Lemma 2 . Let k E Z, p E (0, l) fc with \p\ = 1 , Q be a symmetric matrix such that Af, > 4A 3 , 
and 0 < x < t „ . TTjen f/iere exists 

^h' Vi 


V 1 


X h' min Pi 


y <C 2fce 16A^fc^/^((minpp ^/^+x) j | ^ _ g 16Aj *4/2 ((minpp */^+a?) 




and, R(n) = ui( 1) sac/i that at least 1 — y of the vertices of a graph drawn from Gi (n,p,Q) 
are (R(n),x)-good with probability 1 — o(l). 

Lemma 3. Let k E Z, p E (0, l) fc with \p\ = 1, Q be a symmetric matrix such that Af, > 4A 3 , 
P(n) = w(l), and e > 0 snc/i that (2Af/A 2 ,) 1-€ / 3 < Ai- A vertex of a graph drawn from 
G(p,Q,n ) is (R(n),x)-good but ( \ n c /^ Inn, x)-bad with probability o(l). 

Definition 15. For any vertices $v, v' \in G$, $r, r' \in \mathbb{Z}$, and subset of $G$'s edges $E$, let
$N_{r,r'[E]}(v \cdot v')$ be the number of pairs of vertices $(v_1, v_2)$ such that $v_1 \in N_{r[G \backslash E]}(v)$,
$v_2 \in N_{r'[G \backslash E]}(v')$, and $(v_1, v_2) \in E$.

Note that if $N_{r[G \backslash E]}(v)$ and $N_{r'[G \backslash E]}(v')$ have already been computed, $N_{r,r'[E]}(v \cdot v')$ can
be computed by means of the following algorithm, where $E[v] = \{v' : (v, v') \in E\}$.


Compute-$N_{r,r'[E]}(v \cdot v')$:
    count = 0
    for $v_1 \in N_{r'[G \backslash E]}(v')$:
        for $v_2 \in E[v_1]$:
            if $v_2 \in N_{r[G \backslash E]}(v)$:
                count = count + 1
    return count
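
A runnable counterpart of this procedure is sketched below. It assumes the two spheres are stored as Python sets and that the edges of $E$ are stored as a dictionary mapping each vertex to the set of its $E$-neighbors; these representations and the function name are assumptions made for the example.

```python
def count_cross_edges(sphere_v, sphere_w, e_adj):
    """Count pairs (v1, v2) with v1 in sphere_v, v2 in sphere_w and (v1, v2) in E.

    sphere_v: set of vertices N_{r[G\\E]}(v)
    sphere_w: set of vertices N_{r'[G\\E]}(v')
    e_adj:    dict mapping a vertex to the set of its neighbors along edges of E
    """
    count = 0
    # Iterate over the sphere around v' and follow its E-edges into the sphere around v.
    for w in sphere_w:
        for u in e_adj.get(w, ()):
            if u in sphere_v:  # constant expected time for a set
                count += 1
    return count
```

Iterating over the sphere around $v'$ and testing membership in the other sphere is what makes the cost proportional to $|N_{r'[G \backslash E]}(v')|$ rather than to the product of the two sphere sizes.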


Note that this runs in $O((d+1)|N_{r'[G \backslash E]}(v')|)$ average time. The plan is to independently
put each edge in $G$ in $E$ with probability $c$. Then the probability distribution of $G \backslash E$ will
be $\mathbb{G}_1(n,p,(1-c)Q)$, so $N_{r[G \backslash E]}(v) \approx ((1-c)PQ)^r e_{\sigma_v}$ and $N_{r'[G \backslash E]}(v') \approx ((1-c)PQ)^{r'} e_{\sigma_{v'}}$.
So, it will hopefully be the case that

$$
N_{r,r'[E]}(v \cdot v') \approx ((1-c)PQ)^r e_{\sigma_v} \cdot cQ((1-c)PQ)^{r'} e_{\sigma_{v'}}/n = c(1-c)^{r+r'} e_{\sigma_v} \cdot Q(PQ)^{r+r'} e_{\sigma_{v'}}/n.
$$

More rigorously, we have that:

Lemma 4. Choose $p$, $Q$, $G$ drawn from $\mathbb{G}_1(n,p,Q)$, $E$ randomly selected from $G$'s edges
such that each of $G$'s edges is independently assigned to $E$ with probability $c$, and $v, v' \in G$
chosen independently from $G$'s vertices. Then with probability $1 - o(1)$,

$$
|N_{r,r'[E]}(v \cdot v') - N_{r[G \backslash E]}(v) \cdot cQN_{r'[G \backslash E]}(v')/n| < \Big(1 + \sqrt{|N_{r[G \backslash E]}(v)| \cdot |N_{r'[G \backslash E]}(v')|/n}\Big) \log n.
$$

Proof. Roughly speaking, for each $v_1 \in N_{r[G \backslash E]}(v)$ and $v_2 \in N_{r'[G \backslash E]}(v')$, $(v_1, v_2) \in E$ with
probability $cQ_{\sigma_{v_1},\sigma_{v_2}}/n$. This is complicated by the facts that $(v_1, v_1)$ is never in $E$ and no
edge is in both $G \backslash E$ and $E$. However, this changes the expected value of $N_{r,r'[E]}(v \cdot v')$ given
$G \backslash E$ by at most a constant unless $G$ has more than double its expected number of edges,
something that happens with probability $o(1)$. Furthermore, whether $(v_1, v_2)$ is in $E$ is
independent of whether $(v'_1, v'_2)$ is in $E$ unless $(v'_1, v'_2) = (v_1, v_2)$ or $(v'_1, v'_2) = (v_2, v_1)$. So,
the variance of $N_{r,r'[E]}(v \cdot v')$ is proportional to its expected value, which is

$$
O(|N_{r[G \backslash E]}(v)| \cdot |N_{r'[G \backslash E]}(v')|/n).
$$

$N_{r,r'[E]}(v \cdot v')$ is within $\log n$ standard deviations of its expected value with probability
$1 - o(1)$, which completes the proof. □

Note that if $v$ is an eigenvector of $(1-c)PQ$, then $P^{-1/2}v$ is an eigenvector of the symmetric
matrix $(1-c)P^{1/2}QP^{1/2}$. So, since eigenvectors of a symmetric matrix with different eigenvalues
are orthogonal, we have

$$
N_{r[G \backslash E]}(v) \cdot cQN_{r'[G \backslash E]}(v')/n = \frac{c}{n} \sum_i w_i(N_{r[G \backslash E]}(v)) \cdot Qw_i(N_{r'[G \backslash E]}(v')).
$$

Lemma 5 (Determinant Lemma). Let $0<c<1$, $x>0$, $G$ be drawn from $G_1(n,p,Q)$, $E$ be a subset of $G$'s edges that independently contains each edge with probability $c$, and $m\in\mathbb{Z}^+$. For any $v,v'\in G$ and $r\ge r'\in\mathbb{Z}^+$ such that $((1-c)\lambda_{h'}^2/2)^{r+r'}>\lambda_1^{r+r'}n$, let $M_{m,r,r'[E]}(v\cdot v')$ be the $m\times m$ matrix such that $M_{m,r,r'[E]}(v\cdot v')_{i,j}=N_{r+i+j,r'[E]}(v\cdot v')$ for each $i$ and $j$. There exist $\gamma=\gamma(\{(1-c)\lambda_i\},m)$ and $\gamma'=\gamma'(\{(1-c)\lambda_i\},m)$ such that $\gamma$ is nonzero and for any $r$, $r'$, and vertices $v,v'\in G$, with probability $1-o(1)$, either $v$ is $(r+2m+1,x)$-bad, $v'$ is $(r'+1,x)$-bad, or

$$\begin{aligned}
&\Bigg|\det(M_{m,r,r'[E]}(v\cdot v'))-\frac{c^m}{n^m}\prod_{i=1}^{m-1}w_i(N_{r[G\setminus E]}(v))\cdot Qw_i(N_{r'[G\setminus E]}(v'))\\
&\qquad\cdot\Big(\gamma\,w_m(N_{r[G\setminus E]}(v))\cdot Qw_m(N_{r'[G\setminus E]}(v'))+\gamma'\,w_{m+1}(N_{r[G\setminus E]}(v))\cdot Qw_{m+1}(N_{r'[G\setminus E]}(v'))\Big)\Bigg|\\
&\le\frac{c^m}{n^m}\ln^{m+1}n\,(1-c)^{m(r+r')}|\lambda_{m+2}|^{r+r'}\prod_{i=1}^{m-1}|\lambda_i|^{r+r'}\\
&\quad+\frac{c^m}{n^m}\ln^{m+1}n\,(1-c)^{m(r+r')}|\lambda_m|^{r+r'}|\lambda_{m+1}|^{r+r'}\prod_{i=1}^{m-2}|\lambda_i|^{r+r'}
\end{aligned}$$

where we temporarily adopt the convention that if $i>h'$, $\lambda_i=\lambda_{h'}/\sqrt{2}$ and $w_i(S)=0$ for all $S$.

Alternately, if $m=h'+1$ then with probability $1-o(1)$, either $v$ is $(r+2m+1,x)$-bad, $v'$ is $(r'+1,x)$-bad, or

$$|\det(M_{m,r,r'[E]}(v\cdot v'))|\le\frac{c^{h'+1}}{n^{h'+1}}\log^2(n)(1-c)^{h'(r+r')}\prod_{i=1}^{h'}|\lambda_i|^{r+r'}\left(\frac{((1-c)\lambda_1)^{2r}}{n}+((1-c)\lambda_1)^{r/2}\right)\cdot(1-c)^{r'}\lambda_1^{r'+1}$$

Proof. For each $1\le l\le 2m$ and $1\le i\le h$, let

$$x_l(i)=\frac{c}{n}(1-c)^l\lambda_i^l\,w_i(N_{r[G\setminus E]}(v))\cdot Qw_i(N_{r'[G\setminus E]}(v'))$$

Next, for each $1\le i\le h$ and $0\le l\le m$, let $u_l(i)$ be the column vector whose $j$th entry is $x_{l+j}(i)$ for $1\le j\le m$. Also, for each $1\le l\le m$, let $u_l(h+1)$ be the length $m$ column vector whose $j$th entry is $N_{r+l+j,r'[E]}(v\cdot v')-\sum_{i=1}^{h}x_{l+j}(i)$ for $1\le j\le m$. Note that for each $1\le l\le m$, the $l$th column of $M_{m,r,r'[E]}(v\cdot v')$ is $\sum_{i=1}^{h+1}u_l(i)$. So,

$$\det(M_{m,r,r'[E]}(v\cdot v'))=\sum_{i\in(\mathbb{Z}\cap[1,h+1])^m}\det([u_1(i_1),u_2(i_2),\ldots,u_m(i_m)])$$

For any $i\in(\mathbb{Z}\cap[1,h+1])^m$, if there exist $j\ne j'$ such that $i_j=i_{j'}\le h'$, then $u_j(i_j)=(1-c)^{j-j'}\lambda_{i_j}^{j-j'}u_{j'}(i_{j'})$, which implies that $\det([u_1(i_1),u_2(i_2),\ldots,u_m(i_m)])=0$.

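If $m\le h$ and $i$ is some permutation of the integers from $1$ to $m$, then

$$\det([u_1(i_1),u_2(i_2),\ldots,u_m(i_m)])=\left(\prod_{j=1}^{m}(1-c)^j\lambda_{i_j}^j\right)\mathrm{sgn}(i)\det([u_0(1),u_0(2),\ldots,u_0(m)])$$

The $j$th column of this matrix is proportional to $\frac{c}{n}w_j(N_{r[G\setminus E]}(v))\cdot Qw_j(N_{r'[G\setminus E]}(v'))$, so there exists some $\gamma=\gamma(\{(1-c)\lambda_j\},m)$ such that the sum of all such terms is

$$\gamma\frac{c^m}{n^m}\prod_{j=1}^{m}w_j(N_{r[G\setminus E]}(v))\cdot Qw_j(N_{r'[G\setminus E]}(v'))$$

Alternately, the sum of all such terms is equal to

$$\det\left(\left[\sum_{j=1}^{m}u_1(j),\ \sum_{j=1}^{m}u_2(j),\ \ldots,\ \sum_{j=1}^{m}u_m(j)\right]\right)$$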
If $x_l(0)\ne 0$ for each $1\le l\le m$ and $u'\in\mathbb{R}^m$ is such that

$$u'\cdot\left[\sum_{j=1}^{m}u_1(j),\ \sum_{j=1}^{m}u_2(j),\ \ldots,\ \sum_{j=1}^{m}u_m(j)\right]=0,$$

then for each $1\le i\le m$,

$$\sum_{l=1}^{m}u'_l\sum_{j=1}^{m}x_{l+i}(j)=0$$

$$\sum_{l=1}^{m}\sum_{j=1}^{m}u'_l\,\frac{c}{n}(1-c)^{l+i}\lambda_j^{l+i}\,w_j(N_{r[G\setminus E]}(v))\cdot Qw_j(N_{r'[G\setminus E]}(v'))=0$$

$$\sum_{j=1}^{m}w_j(N_{r[G\setminus E]}(v))\cdot Qw_j(N_{r'[G\setminus E]}(v'))\,(1-c)^i\lambda_j^i\sum_{l=1}^{m}u'_l\,(1-c)^l\lambda_j^l=0$$

That can only hold for all such $i$ if $\sum_{l=1}^{m}u'_l(1-c)^l\lambda_j^l=0$ for all $1\le j\le m$, and that can only be the case if $u'=0$. Therefore, the determinant is nonzero unless $x_l(0)=0$ for some $l$, which implies that $\gamma\ne 0$. If $m<h$ then by similar logic, there exists $\gamma'=\gamma'(\{(1-c)\lambda_i\},m)$ such that the sum of all terms for which $i$ is a permutation of the integers from $1$ to $m-1$ and $m+1$ is

$$\gamma'\frac{c^m}{n^m}\cdot w_{m+1}(N_{r[G\setminus E]}(v))\cdot Qw_{m+1}(N_{r'[G\setminus E]}(v'))\prod_{i=1}^{m-1}w_i(N_{r[G\setminus E]}(v))\cdot Qw_i(N_{r'[G\setminus E]}(v'))$$

That accounts for all $i\in(\mathbb{Z}\cap[1,h+1])^m$ except for some of those such that there exists $j$ such that $i_j\ge\min(m+2,h+1)$ or there exist $j,j'$ such that $i_j=m$ and $i_{j'}=m+1$.

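If $v$ is $(r+2m+1,x)$-good, then

$$w_i(N_{r[G\setminus E]}(v))\cdot P^{-1}w_i(N_{r[G\setminus E]}(v))\le((\min p_j)^{-1/2}+x)^2(1-c)^{2r}\lambda_i^{2r}$$

for all $i$. Similarly, if $v'$ is $(r'+1,x)$-good then

$$w_i(N_{r'[G\setminus E]}(v'))\cdot P^{-1}w_i(N_{r'[G\setminus E]}(v'))\le((\min p_j)^{-1/2}+x)^2(1-c)^{2r'}\lambda_i^{2r'}$$

for all $i$. If both hold, then $|x_l(i)|\le\frac{c}{n}(1-c)^{r+r'+l}|\lambda_i|^{r+r'+l+1}((\min p_j)^{-1/2}+x)^2$ for all $i$. Furthermore, for any $l$ and $j$,

$$\begin{aligned}
|u_l(h+1)_j|&=\left|N_{r+l+j,r'[E]}(v\cdot v')-\sum_{i=1}^{h}x_{l+j}(i)\right|\\
&\le\left|N_{r+l+j,r'[E]}(v\cdot v')-\frac{c}{n}N_{r+l+j[G\setminus E]}(v)\cdot QN_{r'[G\setminus E]}(v')\right|\\
&\quad+\left|\frac{c}{n}N_{r+l+j[G\setminus E]}(v)\cdot QN_{r'[G\setminus E]}(v')-\sum_{i=1}^{h}x_{l+j}(i)\right|\\
&\le\left(1+\sqrt{|N_{r+l+j[G\setminus E]}(v)|\cdot|N_{r'[G\setminus E]}(v')|/n}\right)\log n\\
&\quad+\frac{c}{n}\sum_{i=1}^{h}\Big|w_i(N_{r+l+j[G\setminus E]}(v))\cdot Qw_i(N_{r'[G\setminus E]}(v'))\\
&\qquad\qquad-(1-c)^{l+j}\lambda_i^{l+j}\,w_i(N_{r[G\setminus E]}(v))\cdot Qw_i(N_{r'[G\setminus E]}(v'))\Big|
\end{aligned}$$

hence

$$\begin{aligned}
|u_l(h+1)_j|&\le\left(1+((1-c)\lambda_1)^{(r+r'+l+j)/2}\sqrt{k}\,((\min p_i)^{-1/2}+x)/\sqrt{n}\right)\log n\\
&\quad+\frac{c}{n}\sum_{i=1}^{h}(1-c)^{r+l+j-1}x\lambda_{h'}\left(\frac{\lambda_{h'}^2}{2\lambda_1}\right)^{r}|\lambda_i|^{l+j-1}\cdot\sqrt{w_i(N_{r'[G\setminus E]}(v'))\cdot QPQ\,w_i(N_{r'[G\setminus E]}(v'))}\\
&\le\left(1+((1-c)\lambda_1)^{(r+r'+l+j)/2}\sqrt{k}\,((\min p_j)^{-1/2}+x)/\sqrt{n}\right)\log n\\
&\quad+\frac{c}{n}\sum_{i=1}^{h}(1-c)^{r+l+j-1}x\lambda_{h'}\left(\frac{\lambda_{h'}^2}{2\lambda_1}\right)^{r}|\lambda_i|^{l+j-1}((\min p_j)^{-1/2}+x)(1-c)^{r'}|\lambda_i|^{r'+1}\\
&\le\left(1+((1-c)\lambda_1)^{(r+r'+l+j)/2}\sqrt{k}\,((\min p_j)^{-1/2}+x)/\sqrt{n}\right)\log n\\
&\quad+\frac{ch}{n}(1-c)^{l+j-1}x\lambda_{h'}\left(\frac{(1-c)^2\lambda_{h'}^2}{2}\right)^{(r+r')/2}\lambda_1^{l+j}((\min p_j)^{-1/2}+x)
\end{aligned}$$

with probability $1-o(1)$.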

In other words, under these circumstances $|x_l(i)|$ is upper bounded by a constant multiple of $\frac{c}{n}((1-c)|\lambda_i|)^{r+r'}$ if $v$ and $v'$ are both good, and every entry of $u_l(h+1)$ has a magnitude that is upper bounded by a constant multiple of $((1-c)\lambda_1)^{(r+r')/2}\log n/\sqrt{n}+\frac{c}{n}((1-c)^2\lambda_{h'}^2/2)^{(r+r')/2}$. Either way, every entry of $u_l(i)$ is upper bounded by a constant multiple of $\frac{c}{n}((1-c)|\lambda_i|)^{r+r'}\log n$.

That means that for any $i\in(\mathbb{Z}\cap[1,h+1])^m$ such that $i_j>m+1$ for some $j$, $\det([u_1(i_1),u_2(i_2),\ldots,u_m(i_m)])$ is upper bounded by a constant multiple of $\frac{c^m}{n^m}\log^m(n)(1-c)^{m(r+r')}|\lambda_{m+2}|^{r+r'}\prod_{i=1}^{m-1}|\lambda_i|^{r+r'}$. Similarly, for $i\in(\mathbb{Z}\cap[1,h+1])^m$ such that $i_j>m-1$ and $i_{j'}>m-1$ with $j\ne j'$, $\det([u_1(i_1),u_2(i_2),\ldots,u_m(i_m)])$ is upper bounded by a constant multiple of $\frac{c^m}{n^m}\log^m(n)(1-c)^{m(r+r')}|\lambda_m|^{r+r'}|\lambda_{m+1}|^{r+r'}\prod_{i=1}^{m-2}|\lambda_i|^{r+r'}$. There are at most $m^m$ such $i$; therefore,

$$\begin{aligned}
&\Bigg|\det(M_{m,r,r'[E]}(v\cdot v'))-\frac{c^m}{n^m}\prod_{i=1}^{m-1}w_i(N_{r[G\setminus E]}(v))\cdot Qw_i(N_{r'[G\setminus E]}(v'))\\
&\qquad\cdot\Big(\gamma\,w_m(N_{r[G\setminus E]}(v))\cdot Qw_m(N_{r'[G\setminus E]}(v'))+\gamma'\,w_{m+1}(N_{r[G\setminus E]}(v))\cdot Qw_{m+1}(N_{r'[G\setminus E]}(v'))\Big)\Bigg|\\
&\le\frac{c^m}{n^m}\ln^{m+1}(n)(1-c)^{m(r+r')}|\lambda_{m+2}|^{r+r'}\prod_{i=1}^{m-1}|\lambda_i|^{r+r'}\\
&\quad+\frac{c^m}{n^m}\ln^{m+1}(n)(1-c)^{m(r+r')}|\lambda_m|^{r+r'}|\lambda_{m+1}|^{r+r'}\prod_{i=1}^{m-2}|\lambda_i|^{r+r'}
\end{aligned}$$

with probability $1-o(1)$, as desired.

Alternately, recall that if $v$ is $(r+2m+1,x)$-good then for any $r''<r+2m+1$ and $i\le h$,

$$\left\|E[N_{r''+1[G\setminus E]}(v)]-(1-c)PQN_{r''[G\setminus E]}(v)\right\|=O\left(\frac{((1-c)\lambda_1)^{2r''}}{n}\right)$$

Also, for fixed values of $N_{r'''}(v)$ for all $r'''\le r''<r+2m+1$, each element of $N_{r''+1[G\setminus E]}(v)$ has a variance of $O(|N_{r''[G\setminus E]}(v)|)=O(((1-c)\lambda_1)^{r''})$. So,

$$\left\|w_i(N_{r''+l+j[G\setminus E]}(v))-(1-c)^{l+j}\lambda_i^{l+j}w_i(N_{r''[G\setminus E]}(v))\right\|\le\left(\frac{((1-c)\lambda_1)^{2r''}}{n}+((1-c)\lambda_1)^{r''/2}\right)\ln n$$

with probability $1-o(1)$ for all $l,j\le m$ and $i\le h$. This implies that if $v$ is $(r+2m+1,x)$-good and $v'$ is $(r'+1,x)$-good then

$$\begin{aligned}
|u_l(h+1)_j|&=\left|N_{r+l+j,r'[E]}(v\cdot v')-\sum_{i=1}^{h}x_{l+j}(i)\right|\\
&\le\left|N_{r+l+j,r'[E]}(v\cdot v')-\frac{c}{n}N_{r+l+j[G\setminus E]}(v)\cdot QN_{r'[G\setminus E]}(v')\right|\\
&\quad+\left|\frac{c}{n}N_{r+l+j[G\setminus E]}(v)\cdot QN_{r'[G\setminus E]}(v')-\sum_{i=1}^{h}x_{l+j}(i)\right|\\
&\le\left(1+\sqrt{|N_{r+l+j[G\setminus E]}(v)|\cdot|N_{r'[G\setminus E]}(v')|/n}\right)\log n\\
&\quad+\frac{c}{n}\sum_{i=1}^{h}\Big|w_i(N_{r+l+j[G\setminus E]}(v))\cdot Qw_i(N_{r'[G\setminus E]}(v'))-(1-c)^{l+j}\lambda_i^{l+j}w_i(N_{r[G\setminus E]}(v))\cdot Qw_i(N_{r'[G\setminus E]}(v'))\Big|\\
&\le\left(1+((1-c)\lambda_1)^{(r+r'+l+j)/2}\sqrt{k}\,((\min p_i)^{-1/2}+x)/\sqrt{n}\right)\log n\\
&\quad+\frac{c}{n}\sum_{i=1}^{h}\left(\frac{((1-c)\lambda_1)^{2r}}{n}+((1-c)\lambda_1)^{r/2}\right)\log n\cdot\|Qw_i(N_{r'[G\setminus E]}(v'))\|\\
&\le\left(1+((1-c)\lambda_1)^{(r+r'+l+j)/2}\sqrt{k}\,((\min p_j)^{-1/2}+x)/\sqrt{n}\right)\log n\\
&\quad+\frac{c}{n}\sum_{i=1}^{h}\left(\frac{((1-c)\lambda_1)^{2r}}{n}+((1-c)\lambda_1)^{r/2}\right)\log n\,((\min p_i)^{-1/2}+x)(1-c)^{r'}|\lambda_i|^{r'+1}\\
&\le\left(1+((1-c)\lambda_1)^{(r+r'+l+j)/2}\sqrt{k}\,((\min p_j)^{-1/2}+x)/\sqrt{n}\right)\log n\\
&\quad+\frac{ch}{n}\left(\frac{((1-c)\lambda_1)^{2r}}{n}+((1-c)\lambda_1)^{r/2}\right)\log n\,((\min p_j)^{-1/2}+x)(1-c)^{r'}\lambda_1^{r'+1}
\end{aligned}$$

for any $l$ and $j$ with probability $1-o(1)$. If $m=h'+1$ then for any $i\in(\mathbb{Z}\cap[1,h+1])^m$, either there exist $j\ne j'$ such that $i_j=i_{j'}\le h'$, or there exists $j$ such that $i_j>h'$. Either way, $\det([u_1(i_1),\ldots,u_m(i_m)])$ is upper bounded by a constant multiple of

$$\frac{c^{h'+1}}{n^{h'+1}}\log(n)(1-c)^{h'(r+r')}\prod_{i=1}^{h'}|\lambda_i|^{r+r'}\left(\frac{((1-c)\lambda_1)^{2r}}{n}+((1-c)\lambda_1)^{r/2}\right)\cdot(1-c)^{r'}\lambda_1^{r'+1}$$

with probability $1-o(1)$. There are only $m^m$ possible choices of $i$, so with probability $1-o(1)$, either $v$ is $(r+2m+1,x)$-bad, $v'$ is $(r'+1,x)$-bad, or

$$|\det(M_{m,r,r'[E]}(v\cdot v'))|\le\frac{c^{h'+1}}{n^{h'+1}}\log^2(n)(1-c)^{h'(r+r')}\prod_{i=1}^{h'}|\lambda_i|^{r+r'}\left(\frac{((1-c)\lambda_1)^{2r}}{n}+((1-c)\lambda_1)^{r/2}\right)\cdot(1-c)^{r'}\lambda_1^{r'+1}$$

□
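The mechanism behind the determinant lemma can be seen in a noiseless toy model. The sketch below is only an illustration under simplifying assumptions: the sequence `s(t)` stands in for $N_{t,r'[E]}(v\cdot v')$ and is an exact sum of two geometric terms (hypothetical values 3 and -2 playing the role of $(1-c)\lambda_i$), with no error term $u_l(h+1)$. In that setting, the determinant of the shifted matrix is multiplied by the product of the two ratios each time $r$ increases by one, and it vanishes once $m$ exceeds the number of terms.

```python
def det(a):
    # Exact Laplace expansion along the first row; fine for tiny matrices.
    if not a:
        return 1
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))


def shifted_matrix(s, m, r):
    # m x m matrix whose (i, j) entry is s(r + i + j) for i, j = 1..m.
    return [[s(r + i + j) for j in range(1, m + 1)] for i in range(1, m + 1)]


def s(t):
    # Toy noiseless sequence: two geometric components with ratios 3 and -2.
    return 2 * 3 ** t + (-2) ** t


d4 = det(shifted_matrix(s, 2, 4))
d5 = det(shifted_matrix(s, 2, 5))
d3x3 = det(shifted_matrix(s, 3, 4))  # one more row/column than components
```

Here `d5 == -6 * d4`, since $3\cdot(-2)=-6$, and `d3x3 == 0`. In the lemma the corresponding $(h'+1)\times(h'+1)$ determinant is not exactly zero but is bounded by the small error terms.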


While this is a helpful result, it turns out to be more useful to have an expression that links the determinant to $w_i(N_{r[G\setminus E]}(v))\cdot Qw_i(N_{r'[G\setminus E]}(v'))$ for fixed values of $r$ and $r'$. So, we have the following:

Lemma 6. Let $0<c<1$, $x>0$, $G$ be drawn from $G_1(n,p,Q)$, $E$ be a subset of $G$'s edges that independently contains each edge with probability $c$, and $m\le h'$. Now, for any $v,v'\in G$ and $\sqrt{\ln n}\le r'\le r\in\mathbb{Z}^+$, such that $((1-c)\lambda_{h'}^2/2)^{r+r'}>\lambda_1^{r+r'}n$, with probability $1-o(1)$, either $v$ is $(r+2m+1,x)$-bad, $v'$ is $(r'+1,x)$-bad, or

$$\begin{aligned}
&\Bigg|\det(M_{m,r,r'[E]}(v\cdot v'))-\frac{c^m}{n^m}\prod_{i=1}^{m-1}((1-c)\lambda_i)^{r+r'-2\sqrt{\ln n}}\,w_i(N_{\sqrt{\ln n}[G\setminus E]}(v))\cdot Qw_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))\\
&\qquad\cdot\Big(\gamma((1-c)\lambda_m)^{r+r'-2\sqrt{\ln n}}\,w_m(N_{\sqrt{\ln n}[G\setminus E]}(v))\cdot Qw_m(N_{\sqrt{\ln n}[G\setminus E]}(v'))\\
&\qquad\quad+\gamma'((1-c)\lambda_{m+1})^{r+r'-2\sqrt{\ln n}}\,w_{m+1}(N_{\sqrt{\ln n}[G\setminus E]}(v))\cdot Qw_{m+1}(N_{\sqrt{\ln n}[G\setminus E]}(v'))\Big)\Bigg|\\
&\le\frac{1}{\ln^{2m+2}n\cdot n^m}\prod_{i=1}^{m}|(1-c)\lambda_i|^{r+r'}
\end{aligned}$$

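Proof. First, note that

$$\begin{aligned}
&\Big|w_i(N_{r[G\setminus E]}(v))\cdot Qw_i(N_{r'[G\setminus E]}(v'))-((1-c)\lambda_i)^{r+r'-2\sqrt{\ln n}}\,w_i(N_{\sqrt{\ln n}[G\setminus E]}(v))\cdot Qw_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))\Big|\\
&\le\Big|\lambda_i\Big[w_i(N_{r[G\setminus E]}(v))\cdot P^{-1}\Big(w_i(N_{r'[G\setminus E]}(v'))-((1-c)\lambda_i)^{r'-\sqrt{\ln n}}w_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))\Big)\Big]\Big|\\
&\quad+\Big|\lambda_i\Big[\Big(w_i(N_{r[G\setminus E]}(v))-((1-c)\lambda_i)^{r-\sqrt{\ln n}}w_i(N_{\sqrt{\ln n}[G\setminus E]}(v))\Big)\cdot P^{-1}((1-c)\lambda_i)^{r'-\sqrt{\ln n}}w_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))\Big]\Big|\\
&\le|\lambda_i|\sqrt{w_i(N_{r[G\setminus E]}(v))\cdot P^{-1}w_i(N_{r[G\setminus E]}(v))}\cdot\frac{x|(1-c)\lambda_i|^{r'}}{2^{\sqrt{\ln n}}}\\
&\quad+|\lambda_i|\cdot|(1-c)\lambda_i|^{r'-\sqrt{\ln n}}\sqrt{w_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))\cdot P^{-1}w_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))}\cdot\frac{x|(1-c)\lambda_i|^{r}}{2^{\sqrt{\ln n}}}\\
&\le\frac{2x(1-c)^{r+r'}|\lambda_i|^{r+r'+1}}{2^{\sqrt{\ln n}}}((\min p_j)^{-1/2}+x)=o\left(|(1-c)\lambda_i|^{r+r'}/\ln^{2m+2}(n)\right)
\end{aligned}$$

Also

$$|\lambda_{m+2}|^{r+r'}/|\lambda_m|^{r+r'}\le n^{-\ln(|\lambda_m/\lambda_{m+2}|)/\ln((1-c)\lambda_{h'}^2/(2\lambda_1))}=O(1/\ln^{3m+3}n)$$

and

$$|\lambda_{m+1}|^{r+r'}/|\lambda_{m-1}|^{r+r'}\le n^{-\ln(|\lambda_{m-1}/\lambda_{m+1}|)/\ln((1-c)\lambda_{h'}^2/(2\lambda_1))}=O(1/\ln^{3m+3}n)$$

Combining these inequalities with the determinant lemma yields the desired result. □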


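In some sense, this establishes that

$$\begin{aligned}
\det(M_{m,r,r'[E]}(v\cdot v'))\approx{}&\frac{c^m}{n^m}\prod_{i=1}^{m-1}((1-c)\lambda_i)^{r+r'-2\sqrt{\ln n}}\,w_i(N_{\sqrt{\ln n}[G\setminus E]}(v))\cdot Qw_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))\\
&\cdot\Big(\gamma((1-c)\lambda_m)^{r+r'-2\sqrt{\ln n}}\,w_m(N_{\sqrt{\ln n}[G\setminus E]}(v))\cdot Qw_m(N_{\sqrt{\ln n}[G\setminus E]}(v'))\\
&\quad+\gamma'((1-c)\lambda_{m+1})^{r+r'-2\sqrt{\ln n}}\,w_{m+1}(N_{\sqrt{\ln n}[G\setminus E]}(v))\cdot Qw_{m+1}(N_{\sqrt{\ln n}[G\setminus E]}(v'))\Big)
\end{aligned}$$

However, in order to know this in a useful sense it is necessary to prove that these terms are large relative to the error terms. In order to do that, we need the following: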


Lemma 7. For any $p\in(0,1)^k$ with $\sum p_i=1$ and $k\times k$ matrix $Q$ with nonnegative entries such that every entry of $Q^k$ is positive, there exists a unique $w\in(0,\infty)^k$ such that $w$ is an eigenvector of $PQ$ with eigenvalue $\lambda_1$ and $w\cdot P^{-1}w=1$. Now, let $G$ be drawn from $G_1(n,p,Q)$, $v\in G$, and $w'$ be an eigenvector of $PQ$ with eigenvalue $\lambda_i$ such that $\lambda_i^2>\lambda_1$. With probability $1-o(1)$ either $v$ is $(\sqrt{\ln n},\min(P^{-1}w)_i/2)$-bad, or $|w'\cdot P^{-1}N_{\sqrt{\ln n}}(v)|\ge|\lambda_i|^{\sqrt{\ln n}}/\ln n$.

Proof. Every entry in $(PQ)^k$ is positive, so its eigenvector of largest eigenvalue is unique up to multiplication, and its entries all have the same sign. That means that it has a unique multiple, $w$, such that $w\cdot P^{-1}w=1$ and $w_1>0$. In this case all of $w$'s entries must be positive, and since $(PQ)^kw=\lambda_1^kw$, it must be the case that $PQw=\lambda_1w$.

Given any $(\sqrt{\ln n},\min(P^{-1}w)_j/2)$-good vertex $v$, and any $r\le\sqrt{\ln n}$,


w ■ P 1 N r (v ) > X[w ■ P 1 {n} 


min(P 1 w)j 
2 


r— 1 

^ > A ? ,’ min(P~ 1 rc)j/2 

3=0 


For any eigenvector w" with eigenvalue Xp and w" ■ P 1 w" = 1, 


re 


■P~ l N r (v)\ < \Xi 


\ r w"-P~ l 


W+ 


min(P 1 w) 


r— 1 


^2^'- i ivr<ivi 

3=0 


_1/2 i 

(min p- +min(P~ l w)j/2) 


Ai > A 2 , so there exists a constant vq such that for any r > r 0 , 


N r {v)j > 


min(P l w) 


A \wj 


for all $j$. For $r_0<r<\sqrt{\ln n}$, $1\le j\le k$, and a fixed value of $N_r(v)$, the probability distribution of $N_{r+1}(v)_j$ is within $o(1)$ of a Poisson distribution with expected value $(PQN_r(v))_j$. Furthermore, $N_{r+1}(v)_{j'}$ has negligible dependence on $N_{r+1}(v)_j$ for all $j'\ne j$. So, $w'\cdot P^{-1}N_{r+1}(v)$ has an expected value of $\lambda_iw'\cdot P^{-1}N_r(v)+o(1)$ and a standard deviation
of 



In particular, this means that if there exists $r_0<r<2\ln\ln n/\ln(\lambda_i^2/\lambda_1)$ such that $|w'\cdot P^{-1}N_r(v)|>\lambda_1^{r/2}\ln\ln\ln\ln n$ then with probability $1-o(1)$,

$$\begin{aligned}
|w'\cdot P^{-1}N_{\sqrt{\ln n}}(v)|&\ge|\lambda_i|^{\sqrt{\ln n}-r}\lambda_1^{r/2}\ln\ln\ln\ln n-|\lambda_i|^{\sqrt{\ln n}-r}\lambda_1^{r/2}\sqrt{\ln\ln\ln\ln n}\\
&\ge|\lambda_i|^{\sqrt{\ln n}}\cdot(\lambda_i^2/\lambda_1)^{-r/2}\\
&\ge|\lambda_i|^{\sqrt{\ln n}}/\ln n
\end{aligned}$$

Furthermore, since the probability distribution of $w'\cdot P^{-1}N_r(v)$ for a fixed value of $N_{r-1}(v)$ is a sum of constant multiples of Poisson distributions with expected values of $\Omega(\lambda_1^r)$, it must be the case that $|w'\cdot P^{-1}N_r(v)|>\lambda_1^{r/2}\ln\ln\ln\ln n$ with probability $e^{-O(\ln^2\ln\ln\ln n)}=\omega(1/\ln\ln n)$. Therefore, there exists $r_0<r<2\ln\ln n/\ln(\lambda_i^2/\lambda_1)$ such that this holds with probability $1-o(1)$, and the lemma follows. □

This also implies that for any $v$, $v'$, and $i$, either $v$ is $(\sqrt{\ln n},\min(P^{-1}w)_i/2)$-bad, $v'$ is $(\sqrt{\ln n},\min(P^{-1}w)_i/2)$-bad, or $|w_i(N_{\sqrt{\ln n}}(v))\cdot P^{-1}w_i(N_{\sqrt{\ln n}}(v'))|\ge\lambda_i^{2\sqrt{\ln n}}/\ln^2n$ with probability $1-o(1)$, since the degree of dependence between $N_{\sqrt{\ln n}}(v)$ and $N_{\sqrt{\ln n}}(v')$ is negligible. This allows us to attempt to approximate $PQ$'s eigenvalues as follows.

Lemma 8. Let $0<c<1$, $\min(P^{-1}w)_i/2>x>0$, $G$ be drawn from $G_1(n,p,Q)$, $E$ be a subset of $G$'s edges that independently contains each edge with probability $c$, $v\in G$, such that $(1-c)(\lambda_{h'}^2/2)^4>\lambda_1^7$.

With probability $1-o(1)$, either $v$ is $(\frac{2}{3}\cdot\log n/\log((1-c)\lambda_1),x)$-bad or Basic-Eigenvalue-approximation-algorithm($E$,$c$,$v$) runs in $O(n)$ time and returns $(\lambda'_1,\ldots,\lambda'_{h''})$ such that $h'=h''$ and $|\lambda'_i-\lambda_i|<\ln^{-3/2}(n)$ for all $i$.

Basic-Eigenvalue-approximation-algorithm($E$,$c$,$v$):

Compute $N_{r[G\setminus E]}(v)$ for each $r$ until $|N_{r[G\setminus E]}(v)|>\sqrt{n}$, and then set $\lambda''_1=\sqrt[2r]{n}/(1-c)$.

Set $r=r'=\frac{2}{3}\log n/\log((1-c)\lambda''_1)-\sqrt{\ln n}$. Then, compute

$$\sqrt[2r]{\frac{n\max(|\det M_{m,r,r[E]}(v\cdot v)|,|\det M_{m,r+1,r[E]}(v\cdot v)|)}{c\max(|\det M_{m-1,r,r[E]}(v\cdot v)|,|\det M_{m-1,r+1,r[E]}(v\cdot v)|)}}$$

until an m is found for which this expression is less than ((1 — c)A / 1 / ) 3 / 4 + J— . Then, set 
h" = m — 1. 

Then, set

$$|\lambda'_i|=\frac{1}{1-c}\sqrt{\det(M_{i,r+3,r'[E]}(v\cdot v'))/\det(M_{i,r+1,r'[E]}(v\cdot v'))}\Big/\prod_{j=1}^{i-1}(1-c)|\lambda'_j|$$

unless $|\det(M_{i,r+1,r'[E]}(v\cdot v'))|<\sqrt{|\det(M_{i,r,r'[E]}(v\cdot v'))|\cdot|\det(M_{i,r+2,r'[E]}(v\cdot v'))|}$, in which case set

$$|\lambda'_i|=\frac{1}{1-c}\sqrt{\det(M_{i,r+2,r'[E]}(v\cdot v'))/\det(M_{i,r,r'[E]}(v\cdot v'))}\Big/\prod_{j=1}^{i-1}(1-c)|\lambda'_j|.$$

Repeat this for each $i\le h''$.

Next, for each i < h" , if | |A'| — |A' +1 || < ^ then set A' = |A'| and A / i+1 = — |A' +1 |. For each 
i < h" such that ||A'| - |A' +1 || > ^ and 11A<_ x | - |A'|| > ^ set 

$$\lambda'_i=\frac{1}{1-c}\det(M_{i,r+1,r'[E]}(v\cdot v'))/\det(M_{i,r,r'[E]}(v\cdot v'))\Big/\prod_{j=1}^{i-1}(1-c)\lambda'_j$$

Return $(\lambda'_1,\ldots,\lambda'_{h''})$
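The magnitude step of this algorithm can be illustrated in the same noiseless toy model, under loudly simplified assumptions: `s(t)` is an exact sum of two geometric terms with hypothetical ratios 3 and -2, the factor $(1-c)$ is absorbed into those ratios, and only the ratio of determinants two steps apart is used (the algorithm's choice between the two shifted ratios and its sign-recovery step are omitted). The function name `estimate_magnitudes` is hypothetical.

```python
from fractions import Fraction
from math import sqrt


def det(a):
    # Exact Laplace expansion along the first row; fine for tiny matrices.
    if not a:
        return 1
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))


def shifted_matrix(s, m, r):
    # m x m matrix whose (i, j) entry is s(r + i + j) for i, j = 1..m.
    return [[s(r + i + j) for j in range(1, m + 1)] for i in range(1, m + 1)]


def estimate_magnitudes(s, h, r):
    # The i x i determinant grows by roughly prod_{j<=i} ratio_j^2 when r grows
    # by two, so a square root divided by earlier estimates isolates |ratio_i|.
    estimates = []
    for i in range(1, h + 1):
        growth = Fraction(det(shifted_matrix(s, i, r + 2)),
                          det(shifted_matrix(s, i, r)))
        value = sqrt(abs(growth))
        for previous in estimates:
            value /= previous
        estimates.append(value)
    return estimates


def s(t):
    # Toy noiseless sequence: two geometric components with ratios 3 and -2.
    return 2 * 3 ** t + (-2) ** t


magnitudes = estimate_magnitudes(s, 2, 20)
```

With `r = 20` the two estimates are close to 3 and 2. The approximation for the leading term improves as `r` grows because the contribution of the smaller ratio decays geometrically, which mirrors the role of large $r$ in the lemma.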
Proof. If $v$ is $(\frac{2}{3}\cdot\log n/\log((1-c)\lambda_1),x)$-good, there exists a constant $r_0$ such that for any $r_0<r''<\frac{2}{3}\cdot\log n/\log((1-c)\lambda_1)$,


min(P 1 w) 


k 

((! - c)Ai) r " w i < l Ar r"[GVE](W0])| < ((1 - c)Ai) r "\/fc((minpi)' 1/2 + x) 


3 =1 


So, the minimum $r''$ such that $|N_{r''[G\setminus E]}(v)|>\sqrt{n}$ is within a constant of $\log n/(2\log((1-c)\lambda_1))$. That means that $|\lambda_1-\lambda''_1|$ is upper bounded by a constant multiple of $1/\log n$, and that $r=r'$ is within a constant of the value it would have if $\lambda''_1$ were $\lambda_1$. This also implies that if $n$ is sufficiently large it is less than $\frac{2}{3}\cdot\log n/\log((1-c)\lambda_1)-2h'-3$.

For m < h', r" € {r,r + 1}, and v G G if |A*| / |A*+i| then 

,_ det (M m , r ", r "[E](v • v)) _ , _ , 

7({(1 ~ti j} ’ m)c> ” rn=i((i - c )\) 2r ''~ 2 ^ w i{N^ [G \ E] {v)) ■ Qwi{N V ^ 1[G \ E] (v)) 

with probability 1 — o(l). If |A*| = |A.;+i| then A,; = — Aj+i, so either 

7((1 - c)\ m ) 2r - 2 ' A ™w m {N 'vi^ [G \E]('y)) • Qwm{N^ i^ [G \ £] (v)) has the same sign as V((l - 
c)A m+ i) 2r 2 ^^w m+ i(N^^^ E ^(v)) ■ Qw m + i(^\ r v dFT[G\E](' u )) or 

l({l-c)X m ) 2r+1 ~ 2 ^w m (N^ [GXE] (v))-Qw m {N^ i[GXE] (v)) has the same sign as ^((l- 
c)\ rn +i) 2r+1 ~ 2 '^iv m +i(N'^ [G \ E] (v)) ■ Qw m +i( N ^,[ g\e]( v )))■ Either way, we have that 


(l-o(l))| 


7({(1 -c)\j},m)c n 


n" 


n«i - ^ 


\2r—2\/ln n 


W i( N Vh^[G\E]( V )) ‘ Q W i( N V^[G\E]( V ))\ 


i =1 


< max(| det(M mjf . jT . [£ ](v • u))|, | det(M m;r+1 • u))|) 


m —1 




\2r+l—2\/lnn 


w i(Ny/]E n[G\E]( V )) ' Q Wi (^Vhin[G\E]( V ))\ 


i =1 


• (l7((l — c)A m )^ r+1 • < 3 u ’m(-^ r N /Inn[G\£;]( t ’))l 

+ l7 ((1 ~ c )A m +l) + v/ W m+ l(-A r % /i n ^j[G\E](^)) ‘ Q'Wm+1 (-^vdnn[G\i?] (^0 ) I) 

with probability 1 — o(l). Furthermore, by lemma 8, these bounds are within a factor of 
0(ln 2 n ) of each other with probability 1 — o(l). 

That means that 


nmax(| det M m ^ r [ E ](v • u)|, | det M nhr+1 , r[E] (v • u)|) 
cmax(| detM m _ ljrjT . [£ ;](u • u)|, | det M m _ l r+hr[E] (v • u)|) 


is within a factor of ln 3 (n) of |((1-c)A m ) 2ri 2x ^^d m (^iy i ^ [GV£] (u))•Qu; m (A^y i ^ [GVE] (^;))| 
with probability 1 — o(l), and thus that 


2r 


n max(| det M m ^ r [ E ] {v ■ v) U det M m , r+hr[E] (v • v) \) 
cmax(| det M m _ hr)J , [E] (v • u)|, | det M m _ hr+hr [ E] (v • u)|) 


|(1 - c)A m | ± o(l) 
with probability $1-o(1)$. $|(1-c)\lambda_m|>\sqrt[4]{4(1-c)^3\lambda_1^3}$, so with probability $1-o(1)$, this
expression is not less than ((1 — c)A'/) 3 / 4 + 

If m = h! + 1 , r" € {r, r + 1}, and v G G, then with probability 1 — o(l) either n is 
(r" + 2m + 1, x)-bad or 


\det(Mm :r y[ E ](v -v))\ 


< dhl i n =(„ )(1 _ c) w n | A y" ( ((1 ~hhiih + ((1 _ c)Al) V 

i=i V y 


• (i - c) r a; 


c ti+l 


ln 2 («)(l - c) 2 '*'-’-" XI - ((1 - c)^ 1 ) 3r " /2 


2=1 


If $v$ falls under the latter case,

$$\frac{n\max(|\det M_{m,r,r[E]}(v\cdot v)|,|\det M_{m,r+1,r[E]}(v\cdot v)|)}{c\max(|\det M_{m-1,r,r[E]}(v\cdot v)|,|\det M_{m-1,r+1,r[E]}(v\cdot v)|)}=O(\ln^2(n)((1-c)\lambda_1)^{3r/2}),$$

so

$$\sqrt[2r]{\frac{n\max(|\det M_{m,r,r[E]}(v\cdot v)|,|\det M_{m,r+1,r[E]}(v\cdot v)|)}{c\max(|\det M_{m-1,r,r[E]}(v\cdot v)|,|\det M_{m-1,r+1,r[E]}(v\cdot v)|)}}\le((1-c)\lambda_1)^{3/4}+O(\ln(\ln(n))/\ln(n)).$$

Therefore, with probability $1-o(1)$, either $v$ is $(\frac{2}{3}\cdot\log n/\log((1-c)\lambda_1),x)$-bad or $h''=h'$.
By the previous two lemmas, with probability 1 — o(l) either v is (r + 2m + 4, a:)-bad, or 


for all i < h and 



> A 271 ™/In 2 n 


det {M m yy[ E ](v ■ v ')) 


m— 1 


n m II ^ c)Aj) r +r 2 '^^Wi{N y /^ n { G \ E ](v)) ■ QWiiN^/^iQ^iv 1 )) 

2=1 

• (t(( 1 _ c)A m ) + y/ u ’m(-^ r v /hIT[G\£;]( t ’)) ' ~[ G \E]( V )) 

+ 7 ((1 — c)A m+ i) r + ' + ' Q W m+l(^^]^[G\E]( V ) 

1 


< 


ln 2m+2 n ri n 


nia-^ 


\r +r 


2=1 


for all m < b! + 1 and r < r" < r + 4. 

Assume that the latter holds. If $|\lambda_{m+1}|<|\lambda_m|$ then if $n$ is sufficiently large the $\gamma'$ term will be negligible relative to the $\gamma$ term, while if $|\lambda_{m+1}|=|\lambda_m|$ increasing $r''$ by two would multiply both of these terms by $(1-c)^2\lambda_m^2$. Either way, we have that for any $k\le h'$ and $r\le r''\le r+1$,

$$\left|\det(M_{k,r''+2,r'[E]}(v\cdot v'))/\det(M_{k,r'',r'[E]}(v\cdot v'))-\prod_{j=1}^{k}((1-c)\lambda_j)^2\right|\le\frac{1}{\ln^{7/8}n}$$

as long as $n$ is sufficiently large, and

$$\begin{aligned}
&\Big|\gamma((1-c)\lambda_m)^{r''+r'-2\sqrt{\ln n}}\,w_m(N_{\sqrt{\ln n}[G\setminus E]}(v))\cdot Qw_m(N_{\sqrt{\ln n}[G\setminus E]}(v))\\
&\quad+\gamma'((1-c)\lambda_{m+1})^{r''+r'-2\sqrt{\ln n}}\,w_{m+1}(N_{\sqrt{\ln n}[G\setminus E]}(v))\cdot Qw_{m+1}(N_{\sqrt{\ln n}[G\setminus E]}(v))\Big|\\
&\ge\frac{1}{2}\Big|\gamma((1-c)\lambda_m)^{r''+r'-2\sqrt{\ln n}}\,w_m(N_{\sqrt{\ln n}[G\setminus E]}(v))\cdot Qw_m(N_{\sqrt{\ln n}[G\setminus E]}(v))\Big|
\end{aligned}$$

We assumed that $n$ is large, so that can only fail if $|\lambda_m|=|\lambda_{m+1}|$. In this case, $\lambda_m=-\lambda_{m+1}$, so increasing $r''$ by one would result in both terms having the same sign, at which point the condition is satisfied. So,

$$\left||\lambda'_i|^2-|\lambda_i|^2\right|\le4\ln^{-7/8}n$$

for any $i\le h'$. For any $i<h'$, if $|\lambda_i|>|\lambda_{i+1}|$, then for sufficiently large $n$, $|\lambda'_i|-|\lambda'_{i+1}|$ will be greater than the threshold used by the algorithm. On the other hand, if $|\lambda_i|=|\lambda_{i+1}|$ and $n$ is sufficiently large, $|\lambda'_i|-|\lambda'_{i+1}|$ will be less than that threshold. So, the algorithm will succeed at determining which eigenvalues have the same absolute values and assign $\lambda'_i$ the same sign as $\lambda_i$ for each $i$. So, $|\lambda'_i-\lambda_i|<\ln^{-3/2}(n)$ as desired. Assuming this all works, the slowest part of the algorithm is computing expressions of the form $N_{r'',r'[E]}(v\cdot v)$; each such expression can be computed in $O(n)$ time, and only a constant number of them need to be computed. So, the algorithm runs in $O(n)$ time. □

So, this algorithm sometimes works, but it fails if $v$ is bad. This risk can be mitigated by using multiple vertices as follows.


Improved-Eigenvalue-approximation-algorithm($c$):

Create a set of edges $E$, to which each of $G$'s edges is independently assigned with probability $c$.

Randomly select $\sqrt{\ln n}$ of $G$'s vertices, $v[1],v[2],\ldots,v[\sqrt{\ln n}]$.

Run Basic-Eigenvalue-approximation-algorithm($E$,$c$,$v[i]$) for each $i\le\sqrt{\ln n}$, stopping the algorithm prematurely if it takes more than $O(n\sqrt{\ln n})$ time.

Return $(\lambda'_1,\ldots,\lambda'_{h''})$ where $h''$ and $\lambda'_i$ are the median outputs of the executions of Basic-Eigenvalue-approximation-algorithm for each $i$.
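The median step can be sketched as follows. This is a minimal illustration under assumptions not stated in the text: each run is represented as a tuple of eigenvalue estimates, a run that was stopped prematurely is represented by `None`, and the per-index medians are taken over the runs whose length equals the median length. The function name `aggregate_runs` is hypothetical.

```python
from statistics import median, median_low


def aggregate_runs(runs):
    # Drop runs that were stopped prematurely.
    finished = [run for run in runs if run is not None]
    if not finished:
        return None
    # Median number of reported eigenvalues (median_low keeps it an observed integer).
    h = median_low(len(run) for run in finished)
    agreeing = [run for run in finished if len(run) == h]
    # Coordinate-wise median over the runs that agree on h.
    return tuple(median(run[i] for run in agreeing) for i in range(h))


example = aggregate_runs([(3.0, -2.0), (3.1, -1.9), (2.9, -2.1), (50.0,), None])
```

In this example one run returned a wrong number of eigenvalues and one was aborted, yet the aggregate is `(3.0, -2.0)`, which reflects why a majority of good vertices suffices.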


Lemma 9. Let $0<c<1$, $\min(P^{-1}w)_i/2>x>0$, and $G$ be drawn from $G_1(n,p,Q)$. Improved-Eigenvalue-approximation-algorithm($c$) runs in $O(n\log n)$ time. Furthermore, if $(1-c)(\lambda_{h'}^2/2)^4>\lambda_1^7$, $x<\frac{\lambda_1k}{\lambda_{h'}\min p_i}$, and

$$2ke^{-\frac{x^2(1-c)\lambda_{h'}^2\min p_i}{16\lambda_1k^{3/2}((\min p_i)^{-1/2}+x)}}\Bigg/\left(1-e^{-\frac{x^2(1-c)\lambda_{h'}^2\min p_i}{16\lambda_1k^{3/2}((\min p_i)^{-1/2}+x)}\cdot\left(\left(\frac{(1-c)\lambda_{h'}^4}{4\lambda_1^3}\right)-1\right)}\right)<\frac{1}{2}$$

then Improved-Eigenvalue-approximation-algorithm($c$) returns $(\lambda'_1,\ldots,\lambda'_{h''})$ such that $h'=h''$ and $|\lambda'_i-\lambda_i|<\ln^{-3/2}(n)$ for all $i$ with probability $1-o(1)$.

Proof. Generating $E$ takes $O(n)$ time, picking $v[i]$ takes $o(n)$ time, each execution of Basic-Eigenvalue-approximation-algorithm($E$,$c$,$v[i]$) runs in $O(n\sqrt{\log n})$ time, and combining their outputs takes $o(n)$ time. So, this algorithm runs in $O(n\log n)$ time.

Assuming the conditions are satisfied, there exists $y<\frac{1}{2}$ such that $1-y$ of $G$'s vertices are $(\frac{2}{3}\cdot\log n/\log((1-c)\lambda_1),x)$-good with probability $1-o(1)$. So, with probability $1-o(1)$, the majority of the selected vertices of $G$ are $(\frac{2}{3}\cdot\log n/\log((1-c)\lambda_1),x)$-good and the majority of the executions of Basic-Eigenvalue-approximation-algorithm give good output. If this happens, then the median value of $h''$ is $h'$, and for each $1\le i\le h'$, the median value of $\lambda'_i$ is within $\ln^{-3/2}(n)$ of $\lambda_i$, as desired. □

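Given approximations of $PQ$'s eigenvalues, one can attempt to approximate $w_i(\{v\})\cdot P^{-1}w_i(\{v'\})$ as follows.


Vertex-product-approximation-algorithm($v$,$v'$,$r$,$r'$,$E$,$c$,$(\lambda'_1,\ldots,\lambda'_{h''})$):

(Assumes that $N_{r''[G\setminus E]}(v)$ has already been computed for $r''\le r+2h''+3$ and that $N_{r''[G\setminus E]}(v')$ has already been computed for $r''\le r'$)

For each $i\le h''$, set

$$\begin{aligned}
z_i(v\cdot v')={}&\frac{\det(M_{i,r+1,r'[E]}(v\cdot v'))-(1-c)^i\lambda'_{i+1}\prod_{j=1}^{i-1}\lambda'_j\det(M_{i,r,r'[E]}(v\cdot v'))}{\det(M_{i-1,r+1,r'[E]}(v\cdot v'))-(1-c)^{i-1}\lambda'_i\prod_{j=1}^{i-2}\lambda'_j\det(M_{i-1,r,r'[E]}(v\cdot v'))}\\
&\cdot\frac{n(\lambda'_{i-1}-\lambda'_i)\,\gamma(\{(1-c)\lambda'_j\},i-1)}{c(\lambda'_i-\lambda'_{i+1})\,\gamma(\{(1-c)\lambda'_j\},i)}\cdot\frac{((1-c)\lambda'_i)^{-r-r'-1}}{\lambda'_{i-1}}
\end{aligned}$$

Return $(z_1(v\cdot v'),\ldots,z_{h''}(v\cdot v'))$.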


Lemma 10. Let $0<c<1$, $\min(P^{-1}w)_i/2>x>0$, $G$ be drawn from $G_1(n,p,Q)$, $E$ be a subset of $G$'s edges that independently contains each edge with probability $c$, $v,v'\in G$ and $\sqrt{\ln n}\le r'\le r\in\mathbb{Z}^+$, such that $((1-c)\lambda_{h'}^2/2)^{r+r'}>\lambda_1^{r+r'}n$. Also let $(\lambda'_1,\ldots,\lambda'_{h''})$ be an $h''$-tuple which may depend on $n$ such that $h''=h'$ and $|\lambda'_i-\lambda_i|<\ln^{-3/2}(n)$ for all $i$. Assume that $N_{r''[G\setminus E]}(v)$ has already been computed for $r''\le r+2m+3$ and that $N_{r''[G\setminus E]}(v')$ has already been computed for $r''\le r'$. Vertex-product-approximation-algorithm($v$,$v'$,$r$,$r'$,$E$,$c$,$(\lambda'_1,\ldots,\lambda'_{h''})$) runs in $O(((1-c)\lambda_1)^{r'})$ average time. Furthermore, with probability $1-o(1)$, either $v$ is $(r+2m+4,x)$-bad, $v'$ is $(r'+1,x)$-bad, or Vertex-product-approximation-algorithm($v$,$v'$,$r$,$r'$,$E$,$c$,$(\lambda'_1,\ldots,\lambda'_{h''})$) returns $(z_1(v\cdot v'),\ldots,z_{h'}(v\cdot v'))$ such that $|z_i(v\cdot v')-w_i(\{v\})\cdot P^{-1}w_i(\{v'\})|<2x(\min p_j)^{-1/2}+x^2+o(1)$ for all $i$.

Proof. The slowest part of the algorithm is computing the expressions of the form $N_{r'',r'[E]}(v\cdot v')$. Each of these can be computed in $O(E[|N_{r'[G\setminus E]}(v')|])=O(((1-c)\lambda_1)^{r'})$ average time, and the number of these that need to be computed is constant in $n$. So, this algorithm runs in $O(((1-c)\lambda_1)^{r'})$ average time.

With probability 1 — o(l) either v is (r + 2 m + 4, x)-bad, v’ is (r J + 1, ,x)-bad, or 


for all i < h and 


det (M m y,y [E \(v ■ v')) 


c 

~n n 


m— 1 

II ((! - c)A• QMN^ n[GXE] W)) 


■ (t(( 1 c)A m ) r +r 2y /^Wm(N y /^[ G \ E ](v)) ■ Qw m {N^/^^ G \ E ^{v )) 

+ 7'((1 - c)A m+ i) r +r ~ 2y/ ^w m+ i■ Qw m+ i(A r v ^ 7 [G \ E] (w')))| 

1 r m -111- 

i=l 

for all m < h! + 1 and r < r" < r + 4. 

Assume that the later holds. 


i —1 


det (M itr+iy[E] (v ■ v ')) - (1 - c)*A' +1 A'- det(M i>r>r , [B] (u • v')) 

3 =1 




\r+l+r'—2\/lnra 


< 


A* 

— I — 

In n n* 


rr 




w i( Ar VhTT[G\£;]( v ))' < 3 u; i( iV v / IirT[G\E]( i;/ ))l 


i=i 


n(G - ^ 


\r-\-l-\-r'—2y/\nn 


W j( N VhPn[G\E]( V ^ ' Q W j( N VhPi{G\E]( V ))l 


3 = 1 


for any i < h with probability 1 — o(l). So, 

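$$\begin{aligned}
&\Bigg|\frac{\det(M_{i,r+1,r'[E]}(v\cdot v'))-(1-c)^i\lambda'_{i+1}\prod_{j=1}^{i-1}\lambda'_j\det(M_{i,r,r'[E]}(v\cdot v'))}{\det(M_{i-1,r+1,r'[E]}(v\cdot v'))-(1-c)^{i-1}\lambda'_i\prod_{j=1}^{i-2}\lambda'_j\det(M_{i-1,r,r'[E]}(v\cdot v'))}\\
&\quad-\frac{\gamma(\{(1-c)\lambda_j\},i)}{\gamma(\{(1-c)\lambda_j\},i-1)}\cdot\frac{\lambda_{i-1}(\lambda_i-\lambda_{i+1})}{\lambda_i(\lambda_{i-1}-\lambda_i)}\cdot\frac{c}{n}((1-c)\lambda_i)^{r+r'+1-2\sqrt{\ln n}}\,w_i(N_{\sqrt{\ln n}[G\setminus E]}(v))\cdot Qw_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))\Bigg|\\
&\le\frac{1}{n\sqrt{\ln n}}|(1-c)\lambda_i|^{r+r'+1}
\end{aligned}$$

with probability $1-o(1)$. Therefore,

$$\left|z_i(v\cdot v')-\frac{1}{\lambda_i}((1-c)\lambda_i)^{-2\sqrt{\ln n}}\,w_i(N_{\sqrt{\ln n}[G\setminus E]}(v))\cdot Qw_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))\right|=o(1)$$

with probability $1-o(1)$. Since $w_i(S)\cdot Qw_i(S')=\lambda_iw_i(S)\cdot P^{-1}w_i(S')$, that implies that

$$\begin{aligned}
&|z_i-w_i(\{v\})\cdot P^{-1}w_i(\{v'\})|\\
&\le\left|z_i-\frac{1}{\lambda_i}((1-c)\lambda_i)^{-2\sqrt{\ln n}}\,w_i(N_{\sqrt{\ln n}[G\setminus E]}(v))\cdot Qw_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))\right|\\
&\quad+\left|((1-c)\lambda_i)^{-2\sqrt{\ln n}}\,w_i(N_{\sqrt{\ln n}[G\setminus E]}(v))\cdot P^{-1}w_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))-w_i(\{v\})\cdot P^{-1}w_i(\{v'\})\right|\\
&\le\left|((1-c)\lambda_i)^{-\sqrt{\ln n}}w_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))\cdot P^{-1}\left[((1-c)\lambda_i)^{-\sqrt{\ln n}}w_i(N_{\sqrt{\ln n}[G\setminus E]}(v))-w_i(\{v\})\right]\right|\\
&\quad+\left|w_i(\{v\})\cdot P^{-1}\left[((1-c)\lambda_i)^{-\sqrt{\ln n}}w_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))-w_i(\{v'\})\right]\right|+o(1)
\end{aligned}$$

By goodness of $v$ and $v'$, this is less than or equal to

$$\begin{aligned}
&((1-c)\lambda_i)^{-\sqrt{\ln n}}\sqrt{w_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))\cdot P^{-1}w_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))}\cdot x+\sqrt{w_i(\{v\})\cdot P^{-1}w_i(\{v\})}\cdot x+o(1)\\
&\le\Big(w_i(\{v'\})\cdot P^{-1}w_i(\{v'\})+2w_i(\{v'\})\cdot P^{-1}\left[((1-c)\lambda_i)^{-\sqrt{\ln n}}w_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))-w_i(\{v'\})\right]\\
&\qquad+\left[((1-c)\lambda_i)^{-\sqrt{\ln n}}w_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))-w_i(\{v'\})\right]\cdot P^{-1}\left[((1-c)\lambda_i)^{-\sqrt{\ln n}}w_i(N_{\sqrt{\ln n}[G\setminus E]}(v'))-w_i(\{v'\})\right]\Big)^{1/2}\cdot x\\
&\quad+x\sqrt{1/\min p_j}+o(1)\\
&\le\sqrt{1/\min p_j+2x/\sqrt{\min p_j}+x^2}\cdot x+x/\sqrt{\min p_j}+o(1)\\
&=(x^2+2x(\min p_j)^{-1/2})+o(1)
\end{aligned}$$

with probability $1-o(1)$, as desired. □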

For any two vertices in different communities, $v$ and $v'$, the fact that $Q$'s rows are distinct implies that $Q(\{v\}-\{v'\})\ne0$. So, $w_i(\{v\})\ne w_i(\{v'\})$ for some $1\le i\le h'$. That means that for any two vertices $v$ and $v'$,

$$\begin{aligned}
&(w_i(\{v\})-w_i(\{v'\}))\cdot P^{-1}(w_i(\{v\})-w_i(\{v'\}))\\
&=w_i(\{v\})\cdot P^{-1}w_i(\{v\})-2w_i(\{v\})\cdot P^{-1}w_i(\{v'\})+w_i(\{v'\})\cdot P^{-1}w_i(\{v'\})\ge0
\end{aligned}$$

for all $1\le i\le h'$, with equality for all $i$ if and only if $v$ and $v'$ are in the same community. This also implies that given a vertex $v$, another vertex in the same community $v'$, and a vertex in a different community $v''$,

$$\begin{aligned}
&2w_i(\{v\})\cdot P^{-1}w_i(\{v'\})-w_i(\{v'\})\cdot P^{-1}w_i(\{v'\})\\
&\ge2w_i(\{v\})\cdot P^{-1}w_i(\{v''\})-w_i(\{v''\})\cdot P^{-1}w_i(\{v''\})
\end{aligned}$$

for all $1\le i\le h'$ and the inequality is strict for at least one $i$. This suggests the following algorithms for classifying vertices.
Vertex-comparison-algorithm($v$,$v'$,$r$,$r'$,$E$,$x$,$c$,$(\lambda'_1,\ldots,\lambda'_{h''})$):

(Assumes that $N_{r''[G\setminus E]}(v)$ and $N_{r''[G\setminus E]}(v')$ have already been computed for $r''\le r+2h''+3$)

Run Vertex-product-approximation-algorithm($v$,$v'$,$r$,$r'$,$E$,$c$,$(\lambda'_1,\ldots,\lambda'_{h''})$), Vertex-product-approximation-algorithm($v$,$v$,$r$,$r'$,$E$,$c$,$(\lambda'_1,\ldots,\lambda'_{h''})$), and Vertex-product-approximation-algorithm($v'$,$v'$,$r$,$r'$,$E$,$c$,$(\lambda'_1,\ldots,\lambda'_{h''})$).

If $\exists i: z_i(v\cdot v)-2z_i(v\cdot v')+z_i(v'\cdot v')>5(2x(\min p_j)^{-1/2}+x^2)$ then conclude that $v$ and $v'$ are in different communities.

Otherwise, conclude that $v$ and $v'$ are in the same community.
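The decision rule of this algorithm is simple enough to sketch directly. The following assumes the three lists of approximate products have already been produced by some implementation of Vertex-product-approximation-algorithm; the function name `in_different_communities` and its argument names are hypothetical.

```python
def in_different_communities(z_vv, z_vw, z_ww, x, min_p):
    # z_vv[i], z_vw[i], z_ww[i] approximate w_i({v}).P^-1 w_i({v}),
    # w_i({v}).P^-1 w_i({v'}) and w_i({v'}).P^-1 w_i({v'}) respectively.
    threshold = 5 * (2 * x * min_p ** -0.5 + x ** 2)
    # The estimated squared distance z_vv - 2 z_vw + z_ww exceeds the
    # threshold for some i exactly when the rule declares "different".
    return any(a - 2 * b + c > threshold for a, b, c in zip(z_vv, z_vw, z_ww))
```

For instance, with `x = 0.01` and `min_p = 0.25` the threshold is about 0.2, so estimates `[4.0, 1.0]`, `[4.0, -1.0]`, `[4.0, 1.0]` are declared different (the second coordinate gives 4), while three identical lists are declared the same.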


Lemma 11. Assuming that each of $G$'s edges was independently assigned to $E$ with probability $c$, this algorithm runs in $O(((1-c)\lambda_1)^{\max(r,r')})$ average time. Furthermore, if each execution of Vertex-product-approximation-algorithm succeeds and $13(2x(\min p_j)^{-1/2}+x^2)$ is less than the minimum nonzero value of $(w_i(\{v\})-w_i(\{v'\}))\cdot P^{-1}(w_i(\{v\})-w_i(\{v'\}))$ then the algorithm returns the correct result with probability $1-o(1)$.


Proof. The slowest step of the algorithm is using Vertex-product-approximation-algorithm. This runs in an average time of $O(((1-c)\lambda_1)^{\max(r,r')})$ and must be done 3 times. If each execution of Vertex-product-approximation-algorithm succeeds then with probability $1-o(1)$ the $z_i$ are all within $\frac{5}{4}(2x(\min p_j)^{-1/2}+x^2)$ of the products they seek to approximate, in which case

$$z_i(v\cdot v)-2z_i(v\cdot v')+z_i(v'\cdot v')>5(2x(\min p_j)^{-1/2}+x^2)$$

if and only if

$$(w_i(\{v\})-w_i(\{v'\}))\cdot P^{-1}(w_i(\{v\})-w_i(\{v'\}))\ne0,$$

which is true for some $i$ if and only if $v$ and $v'$ are in different communities. □


Vertex-classification-algorithm($v[]$,$v'$,$r$,$r'$,$E$,$c$,$(\lambda'_1,\ldots,\lambda'_{h''})$):

(Assumes that $N_{r''[G\setminus E]}(v[\sigma])$ have already been computed for $0\le\sigma<k$ and $r''\le r+2h''+3$, that $N_{r''[G\setminus E]}(v')$ has already been computed for all $r''\le r'$, and that $z_i(v[\sigma]\cdot v[\sigma])$ has already been computed for each $i$ and $\sigma$)

Run Vertex-product-approximation-algorithm($v[\sigma]$,$v'$,$r$,$r'$,$E$,$c$,$(\lambda'_1,\ldots,\lambda'_{h''})$) for each $\sigma$. Find a $\sigma$ that minimizes the value of

$$\max_{\sigma'\ne\sigma,\ i\le h''}z_i(v[\sigma]\cdot v[\sigma])-2z_i(v[\sigma]\cdot v')-\left[z_i(v[\sigma']\cdot v[\sigma'])-2z_i(v[\sigma']\cdot v')\right]$$

and conclude that $v'$ is in the same community as $v[\sigma]$.
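The selection step can be sketched as an arg-min over reference vertices. This assumes at least two reference vertices and that the approximate products are supplied as nested lists; the names `classify`, `z_ref` and `z_cross` are hypothetical.

```python
def classify(z_ref, z_cross):
    # z_ref[s][i]   approximates w_i({v[s]}).P^-1 w_i({v[s]})
    # z_cross[s][i] approximates w_i({v[s]}).P^-1 w_i({v'})
    k, h = len(z_ref), len(z_ref[0])

    def score(s, i):
        return z_ref[s][i] - 2 * z_cross[s][i]

    def worst_gap(s):
        # Largest amount by which s loses to any other reference, over all i.
        return max(score(s, i) - score(t, i)
                   for t in range(k) if t != s for i in range(h))

    return min(range(k), key=worst_gap)


# Two references whose eigen-components are (1, 1) and (1, -1); the vertex to
# classify has the same components as reference 0.
label = classify(z_ref=[[1, 1], [1, 1]], z_cross=[[1, 1], [1, -1]])
```

Here `label` is 0: reference 0 never loses to reference 1, while reference 1 loses by 4 in the second coordinate.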
Lemma 12. Assuming that $E$ was generated properly, this algorithm runs in $O(((1-c)\lambda_1)^{r'})$ average time. Let $x>0$ and assume that each execution of Vertex-product-approximation-algorithm succeeds (including the previous ones to compute $z_i(v[\sigma]\cdot v[\sigma])$), and that $v[]$ contains exactly one vertex from each community. Also, assume that $13(2x(\min p_j)^{-1/2}+x^2)$ is less than the minimum nonzero value of $(w_i(\{v\})-w_i(\{v'\}))\cdot P^{-1}(w_i(\{v\})-w_i(\{v'\}))$. Then this algorithm classifies $v'$ correctly with probability $1-o(1)$.

Proof. Again, the slowest step of the algorithm is running Vertex-product-approximation-algorithm. This runs in an average time of $O(((1-c)\lambda_1)^{r'})$ and must be done $k$ times. If the conditions given above are satisfied, then each $z_i$ is within $\frac{19}{18}(2x(\min p_j)^{-1/2}+x^2)$ of the product it seeks to approximate with probability $1-o(1)$. If this is the case, then

$$\begin{aligned}
&2w_i(\{v'\})\cdot P^{-1}w_i(\{v[\sigma]\})-w_i(\{v[\sigma]\})\cdot P^{-1}w_i(\{v[\sigma]\})\\
&\ge2w_i(\{v'\})\cdot P^{-1}w_i(\{v[\sigma']\})-w_i(\{v[\sigma']\})\cdot P^{-1}w_i(\{v[\sigma']\})
\end{aligned}$$

for all $i$ and $\sigma'$ if $v'$ is in the same community as $v[\sigma]$, and

$$\begin{aligned}
&2w_i(\{v'\})\cdot P^{-1}w_i(\{v[\sigma]\})-w_i(\{v[\sigma]\})\cdot P^{-1}w_i(\{v[\sigma]\})\\
&\le2w_i(\{v'\})\cdot P^{-1}w_i(\{v[\sigma']\})-w_i(\{v[\sigma']\})\cdot P^{-1}w_i(\{v[\sigma']\})-13(2x(\min p_j)^{-1/2}+x^2)
\end{aligned}$$

for some $i$ and $\sigma'$ otherwise. So,

$$z_i(v[\sigma]\cdot v[\sigma])-2z_i(v[\sigma]\cdot v')\le z_i(v[\sigma']\cdot v[\sigma'])-2z_i(v[\sigma']\cdot v')+\frac{19}{3}\cdot(2x(\min p_j)^{-1/2}+x^2)$$

for all $i$ and $\sigma'$ iff $v'$ is in the same community as $v[\sigma]$. So,

$$\max_{\sigma'\ne\sigma,\ i\le h''}z_i(v[\sigma]\cdot v[\sigma])-2z_i(v[\sigma]\cdot v')-\left[z_i(v[\sigma']\cdot v[\sigma'])-2z_i(v[\sigma']\cdot v')\right]\le\frac{19}{3}(2x(\min p_j)^{-1/2}+x^2)$$

iff $v'$ is in the same community as $v[\sigma]$. Therefore, the algorithm returns the correct result with probability $1-o(1)$. □

At this point, we can finally start giving algorithms for classifying a graph's vertices.

Lemma 13. Let $x'>0$, and assume that all of the following hold:

$\epsilon<1$

$(1-c)\lambda_{h'}^4>4\lambda_1^3$

$0<x<x'<\frac{\lambda_1k}{\lambda_{h'}\min p_i}$

$(2(1-c)\lambda_1^3/\lambda_{h'}^2)^{1-\epsilon/3}<(1-c)\lambda_1$

$(1+\epsilon/3)>\log((1-c)\lambda_1)/\log((1-c)\lambda_{h'}^2/2\lambda_1)$

$13(2x'(\min p_i)^{-1/2}+(x')^2)<\min_{\ne0}(w_i(\{v\})-w_i(\{v'\}))\cdot P^{-1}(w_i(\{v\})-w_i(\{v'\}))$

$\exists k$ such that every entry of $Q^k$ is positive

$\exists w\in\mathbb{R}^k$ such that $QPw=\lambda_1w$, $w\cdot Pw=1$, and $x\le\min w_i/2$.

$h''=h'$

$|\lambda_i-\lambda'_i|<\ln^{-3/2}(n)$ for all $i$
Unreliable-graph-classification-algorithm($G, c, m, \epsilon, x, (\lambda'_1, \dots, \lambda'_{h''})$):

Randomly assign each edge in $G$ to $E$ independently with probability $c$.

Randomly select $m$ vertices in $G$, $v[0], \dots, v[m-1]$.

Let $r = (1 - \frac{\epsilon}{3})\log n/\log((1-c)\lambda'_1) - \sqrt{\ln n}$ and $r' = \frac{2\epsilon}{3} \cdot \log n/\log((1-c)\lambda'_1)$.

Compute $N_{r''[G \setminus E]}(v[i])$ for each $r'' < r + 2h'' + 3$ and $0 \le i < m$.

Run vertex-comparison-algorithm($v[i], v[j], r, r', E, x, (\lambda'_1, \dots, \lambda'_{h''})$) for every $i$ and $j$.

If these give consistent results, randomly select one alleged member of each community $v'[\sigma]$.
Otherwise, fail.

For every $v''$ in the graph, compute $N_{r''[G \setminus E]}(v'')$ for each $r'' < r'$. Then, run
vertex-classification-algorithm($v'[], v'', r, r', E, (\lambda'_1, \dots, \lambda'_{h''})$) in order to get a hypothesized
classification of $v''$.

Return the resulting classification.


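Let

$$y = 2ke^{-\frac{x^2(1-c)\lambda_{h'}^2 \min p_i}{16\lambda_1 k^{3/2}((\min p_i)^{-1/2} + x)}} \Big/ \left(1 - e^{-\frac{x^2(1-c)\lambda_{h'}^2 \min p_i}{16\lambda_1 k^{3/2}((\min p_i)^{-1/2} + x)} \cdot \left(\left(\frac{(1-c)\lambda_{h'}^4}{4\lambda_1^3}\right) - 1\right)}\right)$$

and

$$y' = 2ke^{-\frac{x'^2(1-c)\lambda_{h'}^2 \min p_i}{16\lambda_1 k^{3/2}((\min p_i)^{-1/2} + x')}} \Big/ \left(1 - e^{-\frac{x'^2(1-c)\lambda_{h'}^2 \min p_i}{16\lambda_1 k^{3/2}((\min p_i)^{-1/2} + x')} \cdot \left(\left(\frac{(1-c)\lambda_{h'}^4}{4\lambda_1^3}\right) - 1\right)}\right).$$

This algorithm runs in $O(m^2 n^{1-\epsilon/3} + n^{1+\frac{2}{3}\epsilon})$ time. Furthermore, with probability $1 - o(1)$,
$G$ is such that Unreliable-graph-classification-algorithm($G, c, m, \epsilon, x, (\lambda'_1, \dots, \lambda'_{h''})$) has at least a

$$1 - k(1 - \min p_i)^m - my$$

chance of classifying at least $1 - y'$ of $G$'s vertices correctly.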


Proof. Let $r_0 = (1 - \frac{\epsilon}{3})\log n/\log((1-c)\lambda_1)$. There exists $y'' < y$ such that if these conditions hold, then with probability $1 - o(1)$, at least $1 - y''$ of $G$'s vertices are $(r_0, x)$-good and the number of vertices in $G$ in community $\sigma$ is within $\sqrt{n}\log n$ of $p_\sigma n$ for all $\sigma$. If this is the case, then for sufficiently large $n$, it is at least $1 - k(1 - \min p_i)^m - my$ likely that every one of the $m$ randomly selected vertices is $(r_0, x)$-good and at least one is selected from each community.

If $v[i]$ is $(r_0, x)$-good for all $i$, then with probability $1 - o(1)$, vertex-comparison-algorithm($v[i], v[j], r, r', E, x, c, (\lambda'_1, \dots, \lambda'_{h''})$) determines whether or not $v[i]$ and $v[j]$ are in the same community correctly for every $i$ and $j$, allowing the algorithm to pick one member of each community. If that happens, then the algorithm will classify each $(r' + h', x')$-good vertex correctly with probability $1 - o(1)$. So, as long as the initial selection of $v[]$ is good, the algorithm classifies at least $1 - y'$ of the graph's vertices correctly with probability $1 - o(1)$.

Generating $E$ and $v[]$ takes $O(n)$ time. Computing $N_{r''[G \setminus E]}(v[i])$ for all $r'' < r + 2h' + 3$ takes $O(m|\cup_{r''} N_{r''[G \setminus E]}(v[i])|) = O(mn)$ time, and computing $N_{r''[G \setminus E]}(v')$ for all $r'' < r'$ and $v' \in G$ takes

$$O(n|\cup_{r'' < r'} N_{r''[G \setminus E]}(v')|) = O(n \cdot ((1-c)\lambda_1)^{r'}) = O(n^{1+\frac{2}{3}\epsilon})$$

time. Once these have been computed, running Vertex-comparison-algorithm($v[i], v[j], r, r', E, x, (\lambda'_1, \dots, \lambda'_{h''})$) for every $i$ and $j$ takes $O(m^2 \cdot ((1-c)\lambda_1)^{r}) = O(m^2 n^{1-\epsilon/3})$ time, at which point an alleged member of each community can be found in $O(m^2)$ time. Running Vertex-classification-algorithm($v'[], v'', r, r', E, c, (\lambda'_1, \dots, \lambda'_{h''})$) for every $v'' \in G$ takes $O(n \cdot ((1-c)\lambda_1)^{r'}) = O(n^{1+\frac{2}{3}\epsilon})$ time. So, the overall algorithm runs in $O(m^2 n^{1-\epsilon/3} + n^{1+\frac{2}{3}\epsilon})$ average time. □
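The role of the two radii in the running-time bound can be checked numerically. The following is a minimal sketch, assuming real-valued radii (any rounding to integers is left open here); the function name `sphere_radii` is hypothetical and not part of the algorithm as stated.

```python
import math

def sphere_radii(n, c, eps, lam1):
    """Radii r and r' as defined in Unreliable-graph-classification-algorithm.

    n    : number of vertices
    c    : probability with which an edge is assigned to E
    eps  : the parameter epsilon
    lam1 : the estimate lambda'_1 of the largest eigenvalue
    """
    growth = math.log((1.0 - c) * lam1)  # log of the thinned growth rate
    if growth <= 0:
        raise ValueError("need (1 - c) * lam1 > 1 so that neighborhoods grow")
    r = (1.0 - eps / 3.0) * math.log(n) / growth - math.sqrt(math.log(n))
    r_prime = (2.0 * eps / 3.0) * math.log(n) / growth
    return r, r_prime
```

With these definitions, $((1-c)\lambda'_1)^{r'} = n^{\frac{2}{3}\epsilon}$ exactly, which is where the $n^{1+\frac{2}{3}\epsilon}$ term in the running time comes from, while $((1-c)\lambda'_1)^{r}$ is at most $n^{1-\epsilon/3}$.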

So, this algorithm can sometimes give a vertex classification that is nontrivially better
than that obtained by guessing. However, it has an asymptotically nonzero failure rate
and requires too much information about the graph's parameters. To get around
that, we combine the results of multiple executions of the algorithm and add a parameter
analysis procedure as follows.

Lemma 14. Assume that there exist x, x', and e such that x is either a unit reciprocal or 
Reliable-graph-classification-algorithm($G, m, \delta, T(n)$) (i.e., Agnostic-sphere-comparison):
Run Improved-Eigenvalue-approximation-algorithm(.1) in order to compute $(\lambda'_1, \dots, \lambda'_{h''})$

Let $\lambda''_1 = \lambda'_1 + 2\ln^{-3/2}(n)$, $\lambda''_{h''} = \lambda'_{h''} - 2\ln^{-3/2}(n)$, and $k' = \lfloor 1/\delta \rfloor$


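Let $x$ be the smallest rational number of minimal numerator such that

$$k'(1-\delta)^m + m \cdot 2k'e^{-\frac{.9x^2(\lambda''_{h''})^2\delta}{16\lambda''_1(k')^{3/2}(\delta^{-1/2} + x)}} \Big/ \left(1 - e^{-\frac{.9x^2(\lambda''_{h''})^2\delta}{16\lambda''_1(k')^{3/2}(\delta^{-1/2} + x)} \cdot \left(\left(\frac{.9(\lambda''_{h''})^4}{4(\lambda''_1)^3}\right) - 1\right)}\right) < \frac{1}{2}$$

Let $\epsilon$ be the smallest rational number that is either a unit reciprocal or one minus a unit reciprocal such that $(2(\lambda''_1)^3/(\lambda''_{h''})^2)^{1-\epsilon/3} < \lambda''_1$ and $(1 + \epsilon/3) > \log(\lambda''_1)/\log((\lambda''_{h''})^2/2\lambda''_1)$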

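Let $c$ be the largest unit reciprocal less than 1/9 such that all of the following hold:

$(1-c)(\lambda''_{h''})^4 > 4(\lambda''_1)^3$

$(2(1-c)(\lambda''_1)^3/(\lambda''_{h''})^2)^{1-\epsilon/3} < (1-c)\lambda''_1$

$(1 + \epsilon/3) > \log((1-c)\lambda''_1)/\log((1-c)(\lambda''_{h''})^2/2\lambda''_1)$

$$k'(1-\delta)^m + m \cdot 2k'e^{-\frac{x^2(1-c)(\lambda''_{h''})^2\delta}{16\lambda''_1(k')^{3/2}(\delta^{-1/2} + x)}} \Big/ \left(1 - e^{-\frac{x^2(1-c)(\lambda''_{h''})^2\delta}{16\lambda''_1(k')^{3/2}(\delta^{-1/2} + x)} \cdot \left(\left(\frac{(1-c)(\lambda''_{h''})^4}{4(\lambda''_1)^3}\right) - 1\right)}\right) < \frac{1}{2}$$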


Run Unreliable-graph-classification-algorithm($G, c, m, \epsilon, x, (\lambda'_1, \dots, \lambda'_{h''})$) $T(n)$ times and
record the resulting classifications.

Find the smallest y" such that there exists a set of more than half of the classifications 
no two of which have more than y" disagreement, and discard all classifications not in 
the set. In this step, define the disagreement between two classifications as the minimum 
disagreement over all bijections between their communities. 

For every vertex in G, randomly pick one of the remaining classifications and assert that it is 
in the community claimed by that classification, where a community from one classification 
is assumed to correspond to the community it has the greatest overlap with in each other 
classification. 

Return the resulting combined classification. 
an integer, $\epsilon$ is either a unit reciprocal or one minus a unit reciprocal, and all of the following hold:
.9xX,min Pi ( - 9x2x l' minp i 1)\ 1 

2^ e 16A 1 fc 3 / 2 ((minp i ) —1 / 2 +rc) j I ^ g 16Aifc 3 / 2 ((minp i ) —1 / 2 +x) 4 a| | <^ _ 


•9(A£,/2) 4 > Aj 

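$\lambda_{h'}^4 > 4\lambda_1^3$

$0 < x < x' < \frac{\lambda_1 k}{\lambda_{h'} \min p_i}$

$(2\lambda_1^3/\lambda_{h'}^2)^{1-\epsilon/3} < \lambda_1$

$(1 + \epsilon/3) > \log(\lambda_1)/\log(\lambda_{h'}^2/2\lambda_1)$

$13(2x'(\min p_i)^{-1/2} + (x')^2) < \min_{\ne 0}(w_i(\{v\}) - w_i(\{v'\})) \cdot P^{-1}(w_i(\{v\}) - w_i(\{v'\}))$

every entry of $Q^k$ is positive

$\exists w \in \mathbb{R}^k$ such that $QPw = \lambda_1 w$, $w \cdot Pw = 1$, and $x < \min w_i/2$

$\delta \le \min p_i$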


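$$\lfloor 1/\delta \rfloor \cdot (1-\delta)^m + m \cdot 2\lfloor 1/\delta \rfloor e^{-\frac{.9x^2\lambda_{h'}^2\delta}{16\lambda_1\lfloor 1/\delta \rfloor^{3/2}(\delta^{-1/2} + x)}} \Big/ \left(1 - e^{-\frac{.9x^2\lambda_{h'}^2\delta}{16\lambda_1\lfloor 1/\delta \rfloor^{3/2}(\delta^{-1/2} + x)} \cdot \left(\left(\frac{.9\lambda_{h'}^4}{4\lambda_1^3}\right) - 1\right)}\right) < \frac{1}{2}$$

$T(n) = \omega(1)$

$T(n) \le \ln(n)$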


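$$2ke^{-\frac{.9x'^2\lambda_{h'}^2 \min p_i}{16\lambda_1 k^{3/2}((\min p_i)^{-1/2} + x')}} \Big/ \left(1 - e^{-\frac{.9x'^2\lambda_{h'}^2 \min p_i}{16\lambda_1 k^{3/2}((\min p_i)^{-1/2} + x')} \cdot \left(\left(\frac{.9\lambda_{h'}^4}{4\lambda_1^3}\right) - 1\right)}\right) < \frac{\min p_i}{4}$$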




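With probability $1 - o(1)$, Reliable-graph-classification-algorithm($G, m, \delta, T(n)$) runs in
$O(m^2 n^{1-\epsilon/3} T(n) + n^{1+\frac{2}{3}\epsilon} T(n))$ time and classifies at least $1 - 3y'$ of $G$'s vertices correctly,
where

$$y' = 2ke^{-\frac{.9x'^2\lambda_{h'}^2 \min p_i}{16\lambda_1 k^{3/2}((\min p_i)^{-1/2} + x')}} \Big/ \left(1 - e^{-\frac{.9x'^2\lambda_{h'}^2 \min p_i}{16\lambda_1 k^{3/2}((\min p_i)^{-1/2} + x')} \cdot \left(\left(\frac{.9\lambda_{h'}^4}{4\lambda_1^3}\right) - 1\right)}\right).$$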


Proof. With probability $1 - o(1)$, Improved-Eigenvalue-approximation-algorithm gives output such that $h'' = h'$ and $|\lambda_i - \lambda'_i| < \ln^{-3/2}(n)$ for each $i$. Assuming that this holds, the algorithm finds the largest $x$ and $\epsilon$ that satisfy the conditions above and the largest unit reciprocal less than 1/9, $c$, that satisfies the conditions for Unreliable-graph-classification-algorithm($G, c, m, \epsilon, x, (\lambda'_1, \dots, \lambda'_{h''})$) to have a greater than 1/2 success rate for any $k \le \lfloor 1/\delta \rfloor$ with probability $1 - o(1)$. Since its success rate is greater than 1/2 and $T(n) = \omega(1)$, more than half of the executions of Unreliable-graph-classification-algorithm give classifications with error $y'$ or less with probability $1 - o(1)$. That means that $y'' \le 2y'$ and at least one of the classifications in the selected set must have error $y'$ or less. Thus, all of the selected classifications have error $3y'$ or less. The requirement that $\min p_j > 4y'$ ensures that the bijection between any two of these classifications' communities that minimizes disagreement is the identity. Therefore, this algorithm classifies at least $1 - 3y'$ of the vertices correctly, as desired.

Assuming this all works correctly, Improved-Eigenvalue-approximation-algorithm runs in $O(n\ln(n))$ time. Finding $x$, $\epsilon$, and $c$ takes constant time, and running Unreliable-graph-classification-algorithm $T(n)$ times takes $O(m^2 n^{1-\epsilon/3} T(n) + n^{1+\frac{2}{3}\epsilon} T(n))$ time. Computing the degree of agreement between each pair of classifications takes $O(n\ln^2(n))$ time. $T(n) \le \ln n$, so the brute force algorithm finds $y''$ and the corresponding set in $O(n)$ time. Combining the classifications takes $O(n)$ time. Therefore, this whole algorithm runs in $O(m^2 n^{1-\epsilon/3} T(n) + n^{1+\frac{2}{3}\epsilon} T(n))$ time with probability $1 - o(1)$, as desired. □
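The combination step of Reliable-graph-classification-algorithm (disagreement minimized over bijections, then a brute-force search for a majority set) can be sketched as follows. This is an illustrative sketch under stated assumptions: classifications are lists of labels in `range(k)`, the search is exhaustive (so only suitable for small $k$ and $T$), and the function names are hypothetical.

```python
from itertools import combinations, permutations

def disagreement(labels_a, labels_b, k):
    """Fraction of vertices on which two classifications differ, minimized
    over all bijections between their k community labels (k! candidates)."""
    n = len(labels_a)
    best = n
    for perm in permutations(range(k)):
        mismatches = sum(1 for a, b in zip(labels_a, labels_b) if perm[a] != b)
        best = min(best, mismatches)
    return best / n

def smallest_majority_threshold(classifications, k):
    """Smallest y'' such that more than half of the classifications pairwise
    disagree by at most y'', together with one such set of indices."""
    t = len(classifications)
    d = [[disagreement(a, b, k) for b in classifications] for a in classifications]
    need = t // 2 + 1  # "more than half"
    thresholds = sorted({d[i][j] for i in range(t) for j in range(i + 1, t)} | {0.0})
    for y in thresholds:
        for subset in combinations(range(t), need):
            if all(d[i][j] <= y for i, j in combinations(subset, 2)):
                return y, list(subset)
    return None  # not reached for t >= 1: the largest threshold admits every subset
```

For example, two classifications that differ only by a relabelling of the communities have disagreement 0 and are grouped together at threshold 0.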

Proof of Theorem 6. If the conditions hold, Reliable-graph-classification-algorithm($G, \ln(4\lfloor 1/\delta \rfloor)/\delta, \delta, \ln(n)$) has the desired properties by the previous lemma. □

7.2 Exact recovery 

Recall that $p$ is a probability vector of dimension $k$, $Q$ is a $k \times k$ symmetric matrix with
positive entries, and $G_2(n,p,Q)$ denotes the stochastic block model with community prior
$p$ and connectivity matrix $\ln(n)Q/n$. A random graph $G$ drawn under $G_2(n,p,Q)$ has a
planted community assignment, which we denote by $\sigma \in [k]^n$ and sometimes call the true
community assignment.

Recall also that exact recovery is solvable for a community partition $[k] = \sqcup_{s=1}^t A_s$ if
there exists an algorithm that assigns to each node in $G$ an element of $\{A_1, \dots, A_t\}$ that
contains its true community$^7$ with probability $1 - o_n(1)$. Exact recovery is solvable in
SBM($n,p,W$) if it is solvable for the partition of $[k]$ into $k$ singletons, i.e., all communities
can be recovered.

7.2.1 Formal results 

Definition 16. Let $\mu, \nu$ be two positive measures on a discrete set $X$, i.e., two functions
from $X$ to $\mathbb{R}_+$. We define the CH-divergence between $\mu$ and $\nu$ by

$$D_+(\mu,\nu) := \max_{t \in [0,1]} \sum_{x \in X} \left(t\mu(x) + (1-t)\nu(x) - \mu(x)^t\nu(x)^{1-t}\right). \tag{16}$$

Note that for a fixed $t$,

$$\sum_{x \in X} \left(t\mu(x) + (1-t)\nu(x) - \mu(x)^t\nu(x)^{1-t}\right)$$

is an $f$-divergence. For $t = 1/2$, i.e., the gap between the arithmetic and geometric means,
we have

$$\sum_{x \in X} \left(t\mu(x) + (1-t)\nu(x) - \mu(x)^t\nu(x)^{1-t}\right) = \frac{1}{2}\|\sqrt{\mu} - \sqrt{\nu}\|_2^2, \tag{17}$$

which is the Hellinger divergence (or distance), and the maximization over $t$ of the part
$\sum_x \mu(x)^t\nu(x)^{1-t}$ is the exponential of the Chernoff divergence. We refer to Section 8.3 for
further discussions on $D_+$. Note also that we will often evaluate $D_+$ as $D_+(x,y)$ where $x, y$
are vectors instead of measures.

$^7$ Up to a relabelling of the communities.

Definition 17. For the SBM $G_2(n,p,Q)$, where $p$ has dimension $k$ (i.e., there are $k$
communities), the finest partition of $[k]$ is the partition of $[k]$ into the largest number of
subsets such that $D_+((PQ)_i, (PQ)_j) \ge 1$ for all $i,j$ that are in different subsets.
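Definitions 16 and 17 can be illustrated numerically. The sketch below approximates $D_+$ by a grid search over $t$ (the objective is concave in $t$, so the grid value slightly underestimates the maximum) and obtains the finest partition by merging every pair of communities whose divergence falls below 1. The grid size, the tolerance `tol`, the function names, and the choice to pass the per-community profile vectors in directly are all assumptions of this sketch.

```python
def ch_divergence(mu, nu, grid=2001):
    """Approximate D_+(mu, nu) from (16) by evaluating the objective on a
    uniform grid of t in [0, 1]. Entries are assumed to be positive."""
    best = 0.0
    for s in range(grid):
        t = s / (grid - 1)
        total = 0.0
        for m, v in zip(mu, nu):
            total += t * m + (1 - t) * v - (m ** t) * (v ** (1 - t))
        best = max(best, total)
    return best

def finest_partition(profiles, tol=1e-6):
    """Group communities i, j together whenever D_+ < 1 (up to tol), and take
    the transitive closure with a small union-find structure."""
    k = len(profiles)
    parent = list(range(k))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(k):
        for j in range(i + 1, k):
            if ch_divergence(profiles[i], profiles[j]) < 1.0 - tol:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(k):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

For two equal-size communities with within-community parameter $a$ and across-community parameter $b$, the profile vectors are $(a/2, b/2)$ and $(b/2, a/2)$, the maximum is attained at $t = 1/2$ by symmetry, and $D_+ = (\sqrt{a} - \sqrt{b})^2/2$. With $a = 9$, $b = 1$ this gives $D_+ = 2$, so the two communities are separated.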

We next present our main theorem for exact recovery. We first provide necessary and 
sufficient conditions for exact recovery of partitions, and then provide an agnostic algorithm 
that solves exact recovery efficiently, more precisely, in quasi-linear time. 

Recall that from [AS15] exact recovery is solvable in the stochastic block model $G_2(n,p,Q)$
for a partition $[k] = \sqcup_{s=1}^t A_s$ if and only if for all $i$ and $j$ in different subsets of the partition,

$$D_+((PQ)_i, (PQ)_j) \ge 1, \tag{18}$$

where $(PQ)_i$ denotes the $i$-th row of the matrix $PQ$. In particular, exact recovery is solvable
in $G_2(n,p,Q)$ if and only if $\min_{i,j \in [k],\, i \ne j} D_+((PQ)_i, (PQ)_j) \ge 1$.

Theorem 7. Let $k \in \mathbb{Z}_+$ denote the number of communities, $p \in (0,1)^k$ with $|p| = 1$ denote
the community prior, $P = \mathrm{diag}(p)$, and let $Q \in (0,\infty)^{k \times k}$ be symmetric with no two rows
equal. For $G \sim G_2(n,p,Q)$, the algorithm Agnostic-degree-profiling($G, \frac{\ln\ln n}{4\ln n}$) recovers
the finest partition, runs in $o(n^{1+\epsilon})$ time for all $\epsilon > 0$, and does not need to know the
parameters (it uses no input except the graph in question).

Note that the second item in the theorem implies that Agnostic-degree-profiling 
solves exact recovery efficiently whenever the parameters p and Q allow for exact recovery 
to be solvable. 

Remark 1. If $Q_{ij} = 0$ for some $i$ and $j$ then the results above still hold, except that if for
all $i$ and $j$ in different subsets of the partition,

$$D_+((PQ)_i, (PQ)_j) \ge 1, \tag{19}$$

but there exist $i$ and $j$ in different subsets of the partition such that $D_+((PQ)_i, (PQ)_j) = 1$
and $(PQ)_{i,k} \cdot (PQ)_{j,k} \cdot ((PQ)_{i,k} - (PQ)_{j,k}) = 0$ for all $k$, then the optimal algorithm will have
an asymptotically constant failure rate. The recovery algorithm also needs to be modified to
accommodate 0's in $Q$.

The algorithm Agnostic-degree-profiling is given in Section 3.1 and replicated below. The idea is to recover the communities with a two-step procedure, similarly to one
of the algorithms used in [ABH14] for the two-community case. In the first step, we run
Agnostic-sphere-comparison on a sparsified version of $G_2(n,p,Q)$ which has a slowly
growing average degree. Hence, from Corollary 2, Agnostic-sphere-comparison correctly
recovers a fraction of nodes that is arbitrarily close to 1 (w.h.p.). In the second step, we
improve the first-step classification by making local checks for each
node in the residue graph and deciding whether the node should be moved to another
community or not. This step requires solving a hypothesis testing problem for deciding the
local degree profile of vertices in the SBM. The CH-divergence appears when resolving this
problem, as the misclassification error exponent. We present this result, which is of independent interest, in
Section 7.2.2. The proof of Theorem 7 is given in Section 7.2.3.

Agnostic-degree-profiling algorithm.

Inputs: a graph $G = ([n], E)$, and a splitting parameter $\gamma \in [0,1]$ (see Theorem 7 for the
choice of $\gamma$).

Output: Each node $v \in [n]$ is assigned a community-list $A(v) \in \{A_1, \dots, A_t\}$, where
$A_1, \dots, A_t$ is intended to be the partition of $[k]$ into the largest number of subsets such that
$D_+((pQ)_i, (pQ)_j) \ge 1$ for all $i, j$ in $[k]$ that are in different subsets.

Algorithm:

(1) Define the graph $G'$ on the vertex set $[n]$ by selecting each edge in $G$ independently with
probability $\gamma$, and define the graph $G''$ that contains the edges in $G$ that are not in $G'$.

(2) Run Agnostic-sphere-comparison on $G'$ to obtain the preliminary classification $\sigma' \in [k]^n$ (see Section 7.1 and Corollary 2).

(3) Estimate $p$ and $Q$ based on the alleged communities' sizes and the edge densities between
them.

(4) For each node $v \in [n]$, determine the community to which node $v$ most likely belongs
based on its degree profile computed from the preliminary classification $\sigma'$ and the estimates
of $p$ and $Q$ (see Section 7.2.2), and call it $\sigma''_v$.

(5) Re-estimate $p$ and $Q$ based on the sizes of the communities claimed by $\sigma''$ and the edge
densities between them.

(6) For each node $v \in [n]$, determine the group among $A_1, \dots, A_t$ to which node $v$ most likely
belongs based on its degree profile computed from the classification $\sigma''$ and
the new estimates of $p$ and $Q$ (see Section 7.2.2).
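Step (1) of the algorithm above is a simple independent edge split. A minimal sketch, assuming the graph is given as a list of vertex pairs; the function name and the optional seeded generator are illustrative choices, not part of the paper:

```python
import random

def split_edges(edges, gamma, rng=None):
    """Split the edge list of G into G' (each edge kept independently with
    probability gamma) and the residue graph G'' (all remaining edges)."""
    rng = rng or random.Random()
    g_prime, g_residue = [], []
    for edge in edges:
        if rng.random() < gamma:
            g_prime.append(edge)
        else:
            g_residue.append(edge)
    return g_prime, g_residue
```

By construction the two edge sets are disjoint and together contain every edge of $G$, which is the source of the (weak) dependence between $G'$ and $G''$ handled in the proof of Theorem 7.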

7.2.2 Testing degree profiles 

In this section, we consider the problem of deciding which community a node in the SBM
belongs to based on its degree profile. The notions are the same as in [AS15]; we repeat them
to ease the reading and to explain how the Agnostic-degree-profiling algorithm works.

Definition 18. The degree profile of a node $v \in [n]$ for a partition of the graph's vertices
into $k$ communities is the vector $d(v) \in \mathbb{Z}_+^k$, where the $j$-th component $d_j(v)$ counts the
number of edges between $v$ and the vertices in community $j$. Note that $d(v)$ is equal to $N_1(v)$
as defined in Definition 8.
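Definition 18 translates directly into a single pass over the edge list. A minimal sketch, assuming vertices are numbered `0..n-1`, edges are undirected pairs without self-loops, and `labels[v]` is the community index of `v` under the partition in question (the function name is hypothetical):

```python
def degree_profiles(n, edges, labels, k):
    """Return d, where d[v][j] counts the edges between v and community j."""
    d = [[0] * k for _ in range(n)]
    for u, v in edges:
        d[u][labels[v]] += 1  # v is a neighbor of u, in community labels[v]
        d[v][labels[u]] += 1  # u is a neighbor of v, in community labels[u]
    return d
```

Computing all profiles this way takes time linear in the number of edges.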

For $G \sim G_2(n,p,Q)$, community $i \in [k]$ has a relative size that concentrates exponentially
fast to $p_i$. Hence, for a node $v$ in community $j$, $d(v)$ is approximately given by $\sum_{i \in [k]} X_{ij}e_i$,
where $X_{ij}$ are independent and distributed as $\mathrm{Bin}(np_i, \ln(n)Q_{ij}/n)$, and where $\mathrm{Bin}(a,b)$
denotes$^8$ the binomial distribution with $a$ trials and success probability $b$. Moreover, the
Binomial is well-enough approximated by a Poisson distribution of the same mean in this
regime. In particular, Le Cam's inequality gives

$$\left\| \mathrm{Bin}\left(na, \frac{b\ln(n)}{n}\right) - \mathcal{P}(ab\ln(n)) \right\|_{TV} \le 2\frac{ab^2\ln^2(n)}{n}, \tag{20}$$

hence, by the additivity of the Poisson distribution and the triangle inequality,

$$\left\| \mu_{d(v)} - \mathcal{P}\left(\ln(n)\sum_{i \in [k]} p_iQ_{ij}e_i\right) \right\|_{TV} = O\left(\frac{\ln^2(n)}{n}\right). \tag{21}$$

$^8$ $\mathrm{Bin}(a,b)$ refers to $\mathrm{Bin}(\lfloor a \rfloor, b)$ if $a$ is not an integer.

We will rely on a simple one-sided bound (see (44)) to approximate our events under the
Poisson measure.

Consider now the following problem. Let $G$ be drawn under the $G_2(n,p,Q)$ SBM and
assume that the planted partition is revealed except for a given vertex. Based on the degree
profile of that vertex, is it possible to classify the vertex correctly with high probability?
We have to resolve a hypothesis testing problem, which involves multivariate Poisson distributions in view of the previous observations. We next study this problem.

Testing multivariate Poisson distributions. Consider the following Bayesian hypothesis testing problem with $k$ hypotheses. The random variable $H$ takes values in $[k]$ with
$\mathbb{P}\{H = j\} = p_j$ (this is the a priori distribution of $H$). Under $H = j$, an observed random
variable $D$ is drawn from a multivariate Poisson distribution with mean $\lambda(j) \in \mathbb{R}_+^k$, i.e.,

$$\mathbb{P}\{D = d \mid H = j\} = \mathcal{P}_{\lambda(j)}(d), \quad d \in \mathbb{Z}_+^k, \tag{22}$$

where

$$\mathcal{P}_{\lambda(j)}(d) = \prod_{i \in [k]} \mathcal{P}_{\lambda_i(j)}(d_i), \tag{23}$$

and

$$\mathcal{P}_{\lambda_i(j)}(d_i) = \frac{\lambda_i(j)^{d_i}}{d_i!}e^{-\lambda_i(j)}. \tag{24}$$

In other words, $D$ has independent Poisson entries with different means. We use the following
notation to summarize the above setting:

$$D \mid H = j \sim \mathcal{P}(\lambda(j)), \quad j \in [k]. \tag{25}$$

Our goal is to infer the value of $H$ by observing a realization of $D$. To minimize the error
probability given a realization of $D$, we must pick the most likely hypothesis conditioned on
this realization, i.e.,

$$\mathrm{argmax}_{j \in [k]} \mathbb{P}\{D = d \mid H = j\}p_j, \tag{26}$$

which is the Maximum A Posteriori (MAP) decoding rule.$^9$ To resolve this maximization,
we can proceed to a tournament of $k - 1$ pairwise comparisons of the hypotheses. Each
comparison allows us to eliminate one candidate for the maximum, i.e.,

$$\mathbb{P}\{D = d \mid H = i\}p_i > \mathbb{P}\{D = d \mid H = j\}p_j \;\Rightarrow\; H \ne j. \tag{27}$$

$^9$ Ties can be broken arbitrarily.
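The MAP rule (26) for multivariate Poisson hypotheses is straightforward to evaluate in log space. A minimal sketch, assuming all means and priors are strictly positive; the function names are hypothetical, and ties are broken in favor of the smallest index, which is one of the arbitrary choices the rule allows.

```python
import math

def log_poisson(mean, d):
    """log of the multivariate Poisson probability (23)-(24) of observing d."""
    return sum(di * math.log(li) - li - math.lgamma(di + 1)
               for li, di in zip(mean, d))

def map_decode(d, means, prior):
    """Index j maximizing P{D = d | H = j} * p_j, as in (26)."""
    scores = [math.log(p_j) + log_poisson(mean, d)
              for mean, p_j in zip(means, prior)]
    return max(range(len(scores)), key=scores.__getitem__)
```

Working with logarithms avoids underflow when the means grow like $\ln(n)$, which is the regime of interest here.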
The error probability $P_e$ of this decoding rule is then given by

$$P_e = \sum_{i \in [k]} \mathbb{P}\{D \in \mathrm{Bad}(i) \mid H = i\}p_i, \tag{28}$$

where $\mathrm{Bad}(i)$ is the region in $\mathbb{Z}_+^k$ where $i$ is not maximizing (26). Moreover, for any $i \in [k]$,

$$\mathbb{P}\{D \in \mathrm{Bad}(i) \mid H = i\} \le \sum_{j \ne i} \mathbb{P}\{D \in \mathrm{Bad}_j(i) \mid H = i\} \tag{29}$$

where $\mathrm{Bad}_j(i)$ is the region in $\mathbb{Z}_+^k$ where $\mathbb{P}\{D = x \mid H = i\}p_i \le \mathbb{P}\{D = x \mid H = j\}p_j$. Note
that with this upper bound, we are counting the overlap regions where $\mathbb{P}\{D = x \mid H = i\}p_i \le \mathbb{P}\{D = x \mid H = j\}p_j$ for different $j$'s multiple times, but no more than $k - 1$ times. Hence,

$$\sum_{j \ne i} \mathbb{P}\{D \in \mathrm{Bad}_j(i) \mid H = i\} \le (k-1)\mathbb{P}\{D \in \mathrm{Bad}(i) \mid H = i\}. \tag{30}$$

Putting (28) and (29) together, we have

$$P_e \le \sum_{i \ne j} \mathbb{P}\{D \in \mathrm{Bad}_j(i) \mid H = i\}p_i, \tag{31}$$

$$= \sum_{i < j} \sum_{d \in \mathbb{Z}_+^k} \min(\mathbb{P}\{D = d \mid H = i\}p_i, \mathbb{P}\{D = d \mid H = j\}p_j) \tag{32}$$

and from (30),

$$P_e \ge \frac{1}{k-1} \sum_{i < j} \sum_{d \in \mathbb{Z}_+^k} \min(\mathbb{P}\{D = d \mid H = i\}p_i, \mathbb{P}\{D = d \mid H = j\}p_j). \tag{33}$$

Therefore the error probability $P_e$ can be controlled by estimating the terms $\sum_{d \in \mathbb{Z}_+^k} \min(\mathbb{P}\{D = d \mid H = i\}p_i, \mathbb{P}\{D = d \mid H = j\}p_j)$. In our case, recall that

$$\mathbb{P}\{D = d \mid H = i\} = \mathcal{P}_{\lambda(i)}(d), \tag{34}$$

which is a multivariate Poisson distribution. In particular, we are interested in the regime
where $k$ is constant and $\lambda(i) = \ln(n)c_i$, $c_i \in \mathbb{R}_+^k$, and $n$ diverges. Due to (32), (33), we
can then control the error probability by controlling $\sum_{x \in \mathbb{Z}_+^k} \min(\mathcal{P}_{\ln(n)c_i}(x)p_i, \mathcal{P}_{\ln(n)c_j}(x)p_j)$,
which we will want to be $o(1/n)$ to classify vertices in the SBM correctly with high probability
based on their degree profiles (see next section). The following lemma provides the relevant
estimates.


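Lemma 15. For any $c_1, c_2 \in (\mathbb{R}_+ \setminus \{0\})^k$ with $c_1 \ne c_2$ and $p_1, p_2 \in \mathbb{R}_+ \setminus \{0\}$,

$$\sum_{x \in \mathbb{Z}_+^k} \min(\mathcal{P}_{\ln(n)c_1}(x)p_1, \mathcal{P}_{\ln(n)c_2}(x)p_2) = O\left(n^{-D_+(c_1,c_2) - \frac{\ln\ln(n)}{2\ln(n)}}\right), \tag{35}$$

$$\sum_{x \in \mathbb{Z}_+^k} \min(\mathcal{P}_{\ln(n)c_1}(x)p_1, \mathcal{P}_{\ln(n)c_2}(x)p_2) = \Omega\left(n^{-D_+(c_1,c_2) - \frac{k\ln\ln(n)}{2\ln(n)}}\right), \tag{36}$$

where $D_+(c_1, c_2)$ is the CH-divergence as defined in (16).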
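The leading factor $n^{-D_+(c_1,c_2)}$ in Lemma 15 can be observed numerically in the scalar case $k = 1$ with $p_1 = p_2 = 1$. The sketch below relies on the elementary Chernoff-type bound $\min(a,b) \le a^t b^{1-t}$, which gives $\sum_x \min(\cdot,\cdot) \le n^{-D_+(c_1,c_2)}$; it does not reproduce the finer $\ln\ln(n)$ correction. The function names, grid size, and truncation `cutoff` are assumptions of the sketch.

```python
import math

def poisson_pmf(mean, x):
    """P{Poisson(mean) = x}, computed in log space."""
    return math.exp(x * math.log(mean) - mean - math.lgamma(x + 1))

def ch_divergence_scalar(c1, c2, grid=2001):
    """Grid approximation of D_+(c1, c2) from (16) for scalars c1, c2 > 0."""
    return max(t * c1 + (1 - t) * c2 - c1 ** t * c2 ** (1 - t)
               for t in (s / (grid - 1) for s in range(grid)))

def overlap(n, c1, c2, cutoff=500):
    """Truncated sum over x of min(P_{ln(n) c1}(x), P_{ln(n) c2}(x))."""
    lam1, lam2 = c1 * math.log(n), c2 * math.log(n)
    return sum(min(poisson_pmf(lam1, x), poisson_pmf(lam2, x))
               for x in range(cutoff))
```

For $c_1 = 4$, $c_2 = 1$ the divergence is about $0.5066$, so the overlap for $n = 10^4$ is bounded by roughly $n^{-0.5066}$.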
In other words, the CH-divergence provides the error exponent for deciding among
multivariate Poisson distributions. We did not find this result in the literature, but found
a similar result obtained by Verdú [Ver86], who shows that the Hellinger distance (the
special case with $t = 1/2$ instead of the maximization over $t$) appears in the error exponent
for testing Poisson point-processes, although [Ver86] does not investigate the exact error
exponent. This lemma was proven in [AS15].

This lemma together with the previous bounds on $P_e$ implies that if $D_+(c_i, c_j) \ge 1$ for all
$i \ne j$, the true hypothesis is correctly recovered with probability $1 - o(1/n)$. However, it may be
that $D_+(c_i, c_j) \ge 1$ only for a subset of $(i,j)$-pairs. What can we then infer? While we may
not recover the true value of $H$ with error probability $o(1/n)$, we may narrow down the search
within a subset of possible hypotheses with that probability of error.

Testing composite multivariate Poisson distributions. We now consider the
previous setting, but we are no longer interested in determining the true hypothesis, but
in deciding between two (or more) disjoint subsets of hypotheses. Under hypothesis 1, the
distribution of $D$ belongs to a set of possible distributions, namely $\mathcal{P}(\lambda_i)$ where $i \in A$, and
under hypothesis 2, the distribution of $D$ belongs to another set of distributions, namely
$\mathcal{P}(\lambda_i)$ where $i \in B$. Note that $A$ and $B$ are disjoint subsets such that $A \cup B = [k]$. In short,

$$D \mid H = 1 \sim \mathcal{P}(\lambda_i), \text{ for some } i \in A, \tag{37}$$

$$D \mid H = 2 \sim \mathcal{P}(\lambda_i), \text{ for some } i \in B, \tag{38}$$

and as before the prior on $\lambda_i$ is $p_i$. To minimize the probability of deciding the wrong
hypothesis upon observing a realization of $D$, we must pick the hypothesis which leads to
the larger probability between $\mathbb{P}\{H \in A \mid D = d\}$ and $\mathbb{P}\{H \in B \mid D = d\}$, or equivalently,

$$\sum_{i \in A} \mathcal{P}_{\lambda(i)}(d)p_i \ge \sum_{i \in B} \mathcal{P}_{\lambda(i)}(d)p_i \;\Rightarrow\; H = 1, \tag{39}$$

$$\sum_{i \in A} \mathcal{P}_{\lambda(i)}(d)p_i < \sum_{i \in B} \mathcal{P}_{\lambda(i)}(d)p_i \;\Rightarrow\; H = 2. \tag{40}$$
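The composite rule (39)-(40) compares two prior-weighted mixtures of multivariate Poisson likelihoods. A minimal sketch in log space, assuming strictly positive means and priors and that both groups are nonempty; the function names and the log-sum-exp helper are illustrative choices.

```python
import math

def log_poisson(mean, d):
    """log of the multivariate Poisson probability of observing d."""
    return sum(di * math.log(li) - li - math.lgamma(di + 1)
               for li, di in zip(mean, d))

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def composite_decide(d, means, prior, group_a):
    """Return 1 if the mixture over group_a is at least as likely as the
    mixture over its complement given d, and 2 otherwise."""
    scores_a, scores_b = [], []
    for i, (mean, p_i) in enumerate(zip(means, prior)):
        score = math.log(p_i) + log_poisson(mean, d)
        (scores_a if i in group_a else scores_b).append(score)
    return 1 if log_sum_exp(scores_a) >= log_sum_exp(scores_b) else 2
```

The same comparison extends to more than two groups by taking the group with the largest mixture score.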


In other words, the problem is similar to the previous one, using the above mixture
distributions. If we denote by $P_e$ the probability of making an error with this test, we have

$$P_e = \sum_{x \in \mathbb{Z}_+^k} \min\left(\sum_{i \in A} \mathcal{P}_{\lambda(i)}(x)p_i, \sum_{i \in B} \mathcal{P}_{\lambda(i)}(x)p_i\right). \tag{41}$$

Moreover, applying bounds on the minima of two sums,

$$P_e \le \sum_{x \in \mathbb{Z}_+^k} \sum_{i \in A, j \in B} \min(\mathcal{P}_{\lambda(i)}(x)p_i, \mathcal{P}_{\lambda(j)}(x)p_j), \tag{42}$$

$$P_e \ge \frac{1}{|A||B|} \sum_{x \in \mathbb{Z}_+^k} \sum_{i \in A, j \in B} \min(\mathcal{P}_{\lambda(i)}(x)p_i, \mathcal{P}_{\lambda(j)}(x)p_j). \tag{43}$$

Therefore, for constant $k$ and $\lambda(i) = \ln(n)c_i$, $c_i \in \mathbb{R}_+^k$, with $n$ diverging, it suffices to control
the decay of $\sum_{x \in \mathbb{Z}_+^k} \min(\mathcal{P}_{\lambda(i)}(x)p_i, \mathcal{P}_{\lambda(j)}(x)p_j)$ when $i \in A$ and $j \in B$, in order to bound
the error probability of deciding whether a vertex degree profile belongs to a group of
communities or not.

The same reasoning can be applied to the problem of deciding whether a given node
belongs to a group of communities, with more than two groups. Also, for any $p$ and $p'$ such
that $|p_j - p'_j| < \ln n/\sqrt{n}$ for each $j$, and any $Q$, $\gamma(n)$, and $i$,

$$\sum_{x \in \mathbb{Z}_+^k} \max\left(\mathrm{Bin}\left(np', \frac{(1-\gamma(n))\ln(n)}{n}Q_i\right)(x) - 2\mathcal{P}_{PQ_i(1-\gamma(n))\ln(n)}(x),\; 0\right) = O(1/n^2). \tag{44}$$

So, the error rate for any algorithm that classifies vertices based on their degree profile in a
graph drawn from a sparse SBM is at most $O(1/n^2)$ more than twice what it would be if
the probability distribution of degree profiles really was the Poisson distribution.

In summary, we have proved the following. 

Lemma 16. Let $k \in \mathbb{Z}_+$ and let $A_1, \dots, A_t$ be disjoint subsets of $[k]$ such that $\cup_{i=1}^t A_i = [k]$.
Let $G$ be a random graph drawn under $G_2(n,p,(1-\gamma(n))Q)$. Assigning the most likely
community subset $A_i$ to a node $v$ based on its degree profile $d(v)$ gives the correct assignment
with probability

$$1 - O\left(n^{-(1-\gamma(n))\Delta - \frac{1}{2}\ln((1-\gamma(n))\ln n)/\ln n}\right),$$

where

$$\Delta = \min_{r,s \in [t],\, r \ne s} \; \min_{i \in A_r, j \in A_s} D_+((pQ)_i, (pQ)_j). \tag{45}$$

Moreover, in order to prove Theorem 7, we will need a version of this lemma that
still holds when some of our information is inaccurate. First of all, consider attempting
the previous testing procedure when one thinks the distributions are $\lambda'$ with probability
$p'$ instead of $\lambda$ with probability $p$. Assume that there exists $\epsilon$ such that for all $i$ and $j$,
$|\lambda'(i)_j - \lambda(i)_j| \le \epsilon\min(\lambda'(i)_j, \lambda(i)_j)$ and $|p'_i - p_i| \le \epsilon\min(p'_i, p_i)$. Then for any $i \in A$, $x \in \mathbb{Z}_+^k$,
the previous hypothesis testing procedure will not classify $x$ as being in $B$ unless

$$\sum_{j \in B} (1+\epsilon)^{|x|+1}\mathcal{P}_{\lambda(j)}(x)p_j \ge (1+\epsilon)^{-|x|-1}\mathcal{P}_{\lambda(i)}(x)p_i.$$

That means that for any $i \in A$, $x \in \mathbb{Z}_+^k$, the probability that $x$ arises from $\mathcal{P}_{\lambda(i)}$ and is then
misclassified as being in $B$ is at most

$$\min\left(\mathcal{P}_{\lambda(i)}(x)p_i, \sum_{j \in B} (1+\epsilon)^{2|x|+2}\mathcal{P}_{\lambda(j)}(x)p_j\right).$$

So, this hypothesis testing procedure has an error probability of at most

$$\sum_{x \in \mathbb{Z}_+^k} (1+\epsilon)^{2|x|+2} \sum_{i \in A, j \in B} \min(\mathcal{P}_{\lambda(i)}(x)p_i, \mathcal{P}_{\lambda(j)}(x)p_j).$$


Furthermore, 
Lemma 17. For any $p$ and $Q$ there exists $c$ such that the following holds for all $\delta < \min p_i/2$.
Let $G$ be a random graph drawn from $G_2(n,p,Q)$, and $\sigma'$ be a classification of $G$'s vertices
that misclassifies $\delta n$ vertices. Also, let $p'$ be the frequencies with which vertices are classified
as being in the communities, and $Q'$ be $n/\ln n$ times the edge densities between the alleged
communities. Then with probability $1 - o(1)$, every element of $p'$ or $Q'$ is within $c\delta + \sqrt{\ln n/n}$
of the corresponding element of $p$ or $Q$.

Proof. With probability $1 - o(1)$, the size of every community is within $\sqrt{\ln n/n}$ of its
expected value, and the edge density between each pair of communities is within $\sqrt{n\ln n}$
of its expected value. Also, there exists a constant $c'$ such that with probability $1 - o(1)$,
no vertex has degree more than $c'\ln n$. Assuming this is the case, the misclassification of
vertices cannot change the apparent number of edges between any two communities by
more than $c'\delta n\ln n$, and it cannot change the apparent size of any community by more
than $\delta n$. Thus, $|p'_i - p_i| \le \sqrt{\ln n/n} + \delta$ and


\Q'ij - Qij] < + 4Ah + 4 Q,J Pi + Sp ’ + 


Pi ■ Pj 


Pipj 


□ 
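The estimates $p'$ and $Q'$ in Lemma 17 can be computed from a classification in one pass over the vertices and edges. The sketch below is one reading of "edge density": the number of edges between two alleged communities divided by the number of vertex pairs between them (unordered pairs within a community). That normalization, like the function name, is an assumption of this sketch rather than something the text specifies.

```python
import math

def estimate_parameters(n, edges, labels, k):
    """Return (p_hat, q_hat): class frequencies and n/ln(n) times the edge
    densities between the alleged communities given by labels."""
    sizes = [0] * k
    for label in labels:
        sizes[label] += 1
    counts = [[0] * k for _ in range(k)]
    for u, v in edges:
        a, b = labels[u], labels[v]
        counts[a][b] += 1
        if a != b:
            counts[b][a] += 1  # keep the count matrix symmetric
    p_hat = [size / n for size in sizes]
    scale = n / math.log(n)
    q_hat = [[0.0] * k for _ in range(k)]
    for a in range(k):
        for b in range(k):
            if a != b:
                pairs = sizes[a] * sizes[b]
            else:
                pairs = sizes[a] * (sizes[a] - 1) / 2
            if pairs > 0:
                q_hat[a][b] = scale * counts[a][b] / pairs
    return p_hat, q_hat
```

Steps (3) and (5) of Agnostic-degree-profiling use estimates of this kind, first from the preliminary classification and then from the refined one.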


This lets us prove the following "robust" version of Lemma 16, which we need in order to prove Theorem 7.

Lemma 18. Let $k \in \mathbb{Z}_+$ and let $A_1, \dots, A_t$ be disjoint subsets of $[k]$ such that $\cup_{i=1}^t A_i = [k]$.
Let $G$ be a random graph drawn under $G_2(n,p,(1-\gamma(n))Q)$. There exist $c_1$, $c_2$, and $c_3$
such that with probability $1 - o(1)$ $G$ is such that for any sufficiently small $\delta$, assigning the
most likely community subset $A_i$ to a node $v$ based on a distortion of its degree profile that
independently gets each node's community wrong with probability at most $\delta$ and estimates of
$p$ and $Q$ based on a classification of the graph's vertices that misclassifies each vertex with
probability $\delta$ gives the correct assignment with probability at least

l _ c 2 ■ (1 + Ci<5) C3hm • ( ( 1— 7( ra )) A - § ln (( 1— 7( n )) Irm)/ inn\ _ fA_ _ 

V / n 2 ’ 

where

$$\Delta = \min_{r,s \in [t],\, r \ne s} \; \min_{i \in A_r, j \in A_s} D_+((pQ)_i, (pQ)_j). \tag{46}$$

Proof. First, note that by the previous lemma, with probability $1 - o(1)$, $G$ is such that
every element of the estimates of $p$ and $Q$ is within $c\delta + \sqrt{\ln n/n}$ of its true value. Choose
$c_3$ such that every vertex in the graph has degree less than $c_3\ln n$ with probability $1 - o(1)$.
If this holds, then the probability of misclassifying a vertex based on its true degree profile
and the estimates of its parameters is at most

2(l + 2c<5/min(p i ,(PQ )^)) 2c3lnn+2 • ^ n -(l-7(n))A-i ln((l- 7 (n))lnn)/lnn^ +Q 

based on the previous bounds on the probability that classifying a vertex based on its degree
profile fails.

Now, let

$$c_1 = \max_{i,j} \sum_{i'} p_{i'}q_{i',j}/(p_iq_{i,j}).$$

The key observation is that $v$'s $m$th neighbor had at least a $\min_{i,j}(p_iq_{i,j})/\sum_{i'} p_{i'}q_{i',j}$ chance
of actually being in community $\sigma$ for each $\sigma$, so its probability of being reported as being in
community $\sigma$ is at most $1 + c_1\delta$ times the probability that it actually is. So, the probability
that its reported degree profile is bad is at most $(1 + c_1\delta)^{|N_1(v)|}$ times the probability that
its actual degree profile is bad. The conclusion follows. □

7.2.3 Proof of Theorem 7 

We prove the possibility result here. The converse was proven in [AS15] for known parameters, hence also applies here.

Let $G \sim G_2(n,p,Q)$ and $\gamma = \frac{\ln\ln n}{4\ln n}$, and let $A_1, \dots, A_t$ be a partition of $[k]$. Agnostic-degree-profiling($G, p, Q, \gamma$) recovers the partition $[k] = \sqcup_{s=1}^t A_s$ with probability $1 - o_n(1)$ if for
all $i,j$ in $[k]$ that are in different subsets,

$$D_+((PQ)_i, (PQ)_j) \ge 1. \tag{47}$$

The idea behind Claim 1 is contained in Lemma 16. However, there are several technical
steps that need to be handled:

1. The graphs $G'$ and $G''$ obtained in step 1 of the algorithm are correlated, since an
edge cannot be both in $G'$ and $G''$. However, this effect can be discarded since two
independent versions would share edges with low enough probability.

2. The classification in step 2 using Agnostic-sphere-comparison has a vanishing
fraction of vertices which are wrongly labeled, and the SBM parameters are unknown.
This requires using the robust version of Lemma 16, namely Lemma 18.

3. In the case where $D_+((PQ)_i, (PQ)_j) = 1$ a more careful classification is needed, as
carried out in steps 3 and 4 of the algorithm.

Proof. With probability $1 - O(1/n)$, no vertex in the graph has degree greater than $c_3\ln n$.
Assuming that this holds, no vertex's set of neighbors in $G''$ is more than

$$(1 - \max q_{i,j}\ln n/n)^{-c_3\ln n} \cdot (n/(n - c_3\ln n))^{c_3\ln n} = 1 + o(1)$$

times as likely to occur as it would be if $G''$ were independent of $G'$. So, the fact that they
are not independent has negligible impact on the algorithm's error rate. Now, let

$$\delta = \left(e^{\frac{(1-\gamma)}{2c_3}\min_{i \ne j} D_+((PQ)_i,(PQ)_j)} - 1\right)/c_1.$$

By Lemma 18, if the classification in step 2 has an error rate of at most $\delta$, then the
classification in step 3 has an error rate of

$$O\left(n^{-(1-\gamma)\min_{i \ne j} D_+((PQ)_i,(PQ)_j)/2} + 1/n^2\right),$$

observing that if $\sigma'_v \ne \sigma_v$ the error rate of $\sigma''_{v'}$ for $v'$ adjacent to $v$ is at worst multiplied by a
constant. That in turn ensures that the final classification has an error rate of at most


O 


^(1 _|_ ©(n-l 1-7 )™ 11 ^ D +(( p Q)i’( p Q)j)/2 


+ l/rr 2 )) C3lnn — lnra -1 / 4 ^ 
n ) 


O Inn . 


□ 